THE EFFECT OF DIFFICULTY AND CHANCE SUCCESS ON
CORRELATIONS BETWEEN ITEMS OR BETWEEN TESTS 


JOHN B. CARROLL 
ENS., H(S), U. S. N. R.* 


A study is made of the extent to which correlations between 
items and between tests are affected by the difficulties of the items 
involved and by chance success through guessing. The Pearsonian 
product-moment coefficient does not necessarily give a correct indica- 
tion of the relation between items or sets of items, since it tends to 
decrease as the items or tests become less similar in difficulty. It is 
suggested that the tetrachoric correlation coefficient can properly be 
used for estimating the correlation between the continua underlying 
items or sets of items even though they differ in difficulty, and a
method for correcting a 2 × 2 table for the effect of chance is proposed.


The correlation coefficient has frequently been used as an indica- 
tion of the extent to which two items or two tests measure the same 
ability. It is the purpose of this paper to show that correlations be- 
tween items and between tests are affected by the difficulties of the 
items involved, and that to the degree that the items or tests are dis- 
similar in difficulty, conventional correlational statistics tend not to 
give a correct indication of the true overlap of ability. This demonstration
is made through the analysis of a theoretical limiting case, that is, one in
which a set of items measures a single ability, but it can also be shown to
apply whenever items have any factorial overlap.

Assume that we have a set of $n$ perfectly reliable items of varying
difficulty which all measure a single ability and only a single ability.
Difficulty is measured by the proportion ($k$) of individuals failing each
item. It is assumed that in order to pass an item the individual must have
true mastery of the task involved; that is, the probability of chance
success, $c$, is assumed to be zero. It is further assumed that each item has
been presented to all individuals under constant conditions.† The ability
measured is of such a nature that success at any level of difficulty implies
success at all lower levels of
* The opinions expressed in this article are the private ones of the writer 
and are not to be construed as official or reflecting the views of the Navy Depart- 
ment or the naval service at large. The writer is indebted to Lt. C. L. Vaughn, 
H(S) USNR, for critical comments on this paper. 


† This assumption precludes the analysis of a set of items in a time-limit
test where the subjects are exposed to varying numbers of items.
difficulty, and failure at any level implies failure at all higher levels
of difficulty.* Thus, it will be found that all individuals who pass an
item at any given level of difficulty will pass all items of less difficulty,
and that all individuals who fail an item at any given level of difficulty
will fail all items of greater difficulty.†


Statistics of a Set of n Items Varying in Difficulty 


For convenience in later formulations, all statistics will be developed
in terms of failure scores, i.e., where failure on an item is scored
as 1, passing as 0.

Let $E_i$ be the failure score of an individual on item $i$ in a test
of $n$ items. Let $E = \sum^{n} E_i$; that is, $E$ represents the total score of an
individual on the test of $n$ items. $N$ will signify the number of
individuals in the population. If we let $k_i$ represent the proportion of
individuals failing item $i$, we find

$$k_i = \frac{\sum^{N} E_i}{N}, \tag{1}$$

and that the mean score on the test of $n$ items is written as

$$\bar{E} = \frac{1}{N}\sum^{N}\sum^{n} E_i = \sum^{n} k_i. \tag{2}$$

The sum of squared test scores is found as

$$\begin{aligned}
\sum^{N} E^2 &= \sum^{N}\left[E_1 + E_2 + E_3 + \cdots + E_i + E_j + \cdots + E_n\right]^2 \\
&= \sum E_1^2 + \sum E_2^2 + \sum E_3^2 + \cdots + \sum E_i^2 + \sum E_j^2 + \cdots + \sum E_n^2 + 2\sum^{N}\sum_{i<j} E_i E_j,
\end{aligned} \tag{3}$$

where $i$ refers to the easier of a pair of items, $j$ to the harder of the
pair. Subscripts $1, 2, 3, \ldots, i, j, \ldots, n$ refer to items. Since all
scores are either 1 or 0, $\sum E_i^2 = \sum E_i$; therefore, the sum of the
squared expressions in (3) can be written merely as $N\sum k_i$, by (2).
The expression $2\sum E_i E_j$ represents the cross-products of scores on
pairs of items. Only those failing both items of a pair will contribute
a term other than zero to the sum $\sum E_i E_j$. In accordance with our
* Most factors of ability, but not necessarily all, are probably of this nature. 
† If one moves out of the context of these assumptions, however, the fact
that any given pair of items is characterized by such a relation does not
guarantee that the items are factorially homogeneous.
assumptions, the number of individuals who fail both items of a pair
is given immediately by the number who fail the easier item, since
none of these pass the harder item. If all $n$ items are ranked in
difficulty,

$$\sum^{N}\sum E_i E_j = N\sum n_i k_i, \tag{4}$$

where $n_i$ = the number of items ranked above item $i$ in order of
increasing difficulty. The expression $n_i$ is intended to include all items
ranked above $i$ even when equal in difficulty to item $i$. Then,

$$\sum E^2 = N\sum k_i + 2N\sum n_i k_i. \tag{5}$$

Substituting values from (5) and (2) in a standard formula for the
standard deviation, one obtains

$$\sigma_E = \sqrt{\sum k_i + 2\sum n_i k_i - \left(\sum k_i\right)^2}. \tag{6}$$

If all $n$ items are of the same difficulty $k$, formulas (2) and (6)
become, respectively,

$$\bar{E} = nk \tag{7}$$

and

$$\sigma_E = n\sqrt{k(1-k)}. \tag{8}$$

For one item, formulas (7) and (8) become

$$\bar{E} = k; \tag{9}$$

$$\sigma_E = \sqrt{k(1-k)}. \tag{10}$$
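Formulas (2) and (6) are easy to check numerically. The following is only a minimal sketch, assuming the items are supplied as a plain list of $k$ values; the function name `failure_score_stats` is hypothetical and not part of the paper. It ranks the items from hardest to easiest so that $n_i$ is simply the position of each item in the ranking.

```python
import math

def failure_score_stats(k):
    """Mean and SD of total failure scores for perfectly scalable items,
    following formulas (2) and (6). `k` holds the proportion failing each item."""
    # Rank items from hardest to easiest; n_i counts the items ranked above item i.
    ranked = sorted(k, reverse=True)
    sum_k = sum(ranked)
    sum_nk = sum(n_i * k_i for n_i, k_i in enumerate(ranked))
    variance = sum_k + 2 * sum_nk - sum_k ** 2
    return sum_k, math.sqrt(variance)

# Item set with k = .45, .45, .45, .45, .10 (set 1 of the numerical illustration).
mean, sd = failure_score_stats([0.45, 0.45, 0.45, 0.45, 0.10])
print(round(mean, 4), round(sd ** 2, 4))
```

For items of uniform difficulty the same function should agree with formulas (7) and (8), since ties are ranked arbitrarily and $n_i$ then runs over $0, 1, \ldots, n-1$.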


Correlations Between Sets of Items Whose Difficulties Are Known 


It is now possible to find the expected Pearsonian correlation co- 
efficient between two sets of items whose difficulties are known. Both 
sets of items are assumed to measure a single factor of ability, and 
the probability of chance success is zero. 

Let $E_1$ = an error score on item set 1, and $E_2$ = an error score
on item set 2. Then let

$$E_t = E_1 + E_2. \tag{11}$$

In this and in subsequent formulations, we shall let $N = 1$ for
the purpose of simplification. This means that score frequencies and
scatterplot cell frequencies are expressed as proportions of the
population. (Even if $N$ is left explicit, it drops out in the final formulas.)
Then we may write, by (2),

$$\sum E_t = \sum k_1 + \sum k_2, \tag{12}$$

and by (5),

$$\sum E_t^2 = \sum k_t + 2\sum n_t k_t. \tag{13}$$

In determining values of $n_t$ in (13), all items in both sets are
arranged together in order of decreasing difficulty. In (12) and (13),
the subscripts of $k$ identify the $k$ values as belonging to set 1, set 2,
or the total ($t$) of the 2 sets. Squaring and summing equation (11),
we obtain

$$\sum E_t^2 = \sum E_1^2 + \sum E_2^2 + 2\sum E_1 E_2,$$

or

$$\sum E_1 E_2 = \tfrac{1}{2}\left(\sum E_t^2 - \sum E_1^2 - \sum E_2^2\right). \tag{14}$$

Substituting expressions obtained in equations (5) and (13), we have

$$\begin{aligned}
\sum E_1 E_2 &= \tfrac{1}{2}\left(\sum k_t + 2\sum n_t k_t - \sum k_1 - 2\sum n_1 k_1 - \sum k_2 - 2\sum n_2 k_2\right) \\
&= \sum n_t k_t - \sum n_1 k_1 - \sum n_2 k_2.
\end{aligned} \tag{15}$$

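The correlation $r_{12}$ can be found by substituting the required values
in the formula

$$r_{12} = \frac{\sum E_1 E_2 - \left(\sum E_1\right)\left(\sum E_2\right)}{\sqrt{\left[\sum E_1^2 - \left(\sum E_1\right)^2\right]\left[\sum E_2^2 - \left(\sum E_2\right)^2\right]}}, \tag{16}$$

which gives

$$r_{12} = \frac{\sum n_t k_t - \sum n_1 k_1 - \sum n_2 k_2 - \left(\sum k_1\right)\left(\sum k_2\right)}{\sqrt{\left[\sum k_1 + 2\sum n_1 k_1 - \left(\sum k_1\right)^2\right]\left[\sum k_2 + 2\sum n_2 k_2 - \left(\sum k_2\right)^2\right]}}. \tag{17}$$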

Formula (17) shows that the correlation between two sets of items 
all of which measure a single ability can be written directly in terms 
of the number and difficulty of the items. 
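As a rough illustration of how formula (17) can be evaluated from item difficulties alone, the sketch below ranks each set, and then the pooled sets, from hardest to easiest and forms the three $\sum nk$ terms. The helper names `rank_sums` and `expected_r` are hypothetical, and the example difficulties are those of the numerical illustration in Table 1.

```python
import math

def rank_sums(k):
    """Return (sum of k, sum of n*k) with items ranked hardest first, n = 0, 1, 2, ..."""
    ranked = sorted(k, reverse=True)
    return sum(ranked), sum(n * x for n, x in enumerate(ranked))

def expected_r(k1, k2):
    """Expected Pearson r between failure scores on two item sets, formula (17)."""
    s1, nk1 = rank_sums(k1)
    s2, nk2 = rank_sums(k2)
    _, nkt = rank_sums(list(k1) + list(k2))   # both sets ranked together
    numerator = nkt - nk1 - nk2 - s1 * s2
    var1 = s1 + 2 * nk1 - s1 ** 2
    var2 = s2 + 2 * nk2 - s2 ** 2
    return numerator / math.sqrt(var1 * var2)

set1 = [0.45, 0.45, 0.45, 0.45, 0.10]
set2 = [0.95, 0.85, 0.85, 0.20, 0.20]
print(round(expected_r(set1, set2), 4))
```

The sketch does not guard against a set with zero variance (for example a single item with $k = 0$ or $k = 1$), for which the coefficient is undefined.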

A number of interesting special cases of formula (17) can be
written more simply. When all items of set 2 are harder than any
item in set 1, the numerator of (17) can be written as $\sum k_1\left(n_2 - \sum k_2\right)$.
When each item in set 1 is matched by an item of corresponding
difficulty in set 2, it will be found that the numerator of (17) becomes
equal to the denominator. Hence, $r_{12}$ in this case is equal to unity.
This result is not surprising since the failure scores for the two sets
of items should, in view of the assumptions outlined, exactly
correspond.

When each set consists of only one item, we are actually dealing
with the correlation of two items measuring a single ability. Let the
subscript $i$ refer to the easier item and $j$ to the harder item. Still
using failure scores and letting $N = 1$, we note that the cross-product
term $\sum E_i E_j$ is equal to the proportion of individuals who fail
the easier item:

$$\sum E_i E_j = k_i. \tag{18}$$


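The formula for the Pearsonian correlation coefficient for two items
measuring a single factor can be developed, using formulas (9) and
(10), as

$$\begin{aligned}
r_{ij} &= \frac{\sum E_i E_j - \bar{E}_i \bar{E}_j}{\sigma_i \sigma_j} \\
&= \frac{k_i - k_i k_j}{\sqrt{k_i k_j (1 - k_i)(1 - k_j)}} \\
&= \sqrt{\frac{k_i (1 - k_j)}{k_j (1 - k_i)}}.
\end{aligned} \tag{19}$$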

Except for differences in notation, formula (19) is identical with a 
formula presented by Ferguson* as giving the value expected for the 
maximum correlation between two items homogeneous in content but 
not in difficulty. As Ferguson points out, formula (19) gives values 
of less than unity unless the items are equal in difficulty. 
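To see how quickly formula (19) falls away from unity, one can tabulate it for a few pairs of difficulties. This is only an illustrative sketch; the function name `max_item_r` and the chosen pairs are assumptions made for the example.

```python
import math

def max_item_r(k_easy, k_hard):
    """Formula (19): Pearson r between two items measuring a single ability.
    k_easy and k_hard are the proportions failing the easier and harder item."""
    return math.sqrt(k_easy * (1 - k_hard) / (k_hard * (1 - k_easy)))

# Pairs of (k_i, k_j) that grow progressively less similar in difficulty.
for k_i, k_j in [(0.5, 0.5), (0.4, 0.6), (0.2, 0.8), (0.1, 0.9)]:
    print(k_i, k_j, round(max_item_r(k_i, k_j), 3))
```

Only the pair of equal difficulty reaches 1; the coefficient shrinks as the two difficulties move apart, even though both items measure the same ability perfectly.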

When each set consists of items of uniform difficulty, the error 
score for each set can be obtained by multiplying the error score for 
one of its component items by the number of items in the set, since 
all items in each set are passed or failed together. Such multiplica- 
tion by a constant does not alter the relationship specified by formula 
(19) for any pair of items selected one from each set. Formula (19) 
therefore applies to this case, $k_i$ and $k_j$ referring to the difficulties
of items in the easier and the harder sets, respectively.


Numerical Illustration of Formula (17) 


In order to make concrete the operations involved in formula 
(17) and to relate them to conventional statistical procedures, a nu- 
merical example is given in Table 1. The analysis by formula (17) 
is shown in the left-hand portion of the table, while the conventional 
treatment is given at the right. Assume that we wish to find the ex- 
pected correlation between two sets of items all of which measure a 
single ability. Set 1 consists of items with $k$ values (proportions failing
the item) of .45, .45, .45, .45, and .10, respectively; set 2 has items
with $k$ values of .95, .85, .85, .20, and .20, respectively. In Table 1 at
(A), these items are arranged in order of decreasing difficulty, and


* Ferguson, G. A. The factorial interpretation of test difficulty. Psycho- 
metrika, 1941, 6, 323-329. 
TABLE 1

Numerical Illustration of Formula (17) for Two Sets of Items

(A) Computation from item difficulties

    Set 1                        Set 2
    k_1     n_1                  k_2     n_2
    .45     0                    .95     0
    .45     1                    .85     1
    .45     2                    .85     2
    .45     3                    .20     3
    .10     4                    .20     4

    Σk_1 = 1.90                  Σk_2 = 3.05
    Σn_1 k_1 = 3.10              Σn_2 k_2 = 3.95
    Mean E_1 = 1.90              Mean E_2 = 3.05
    σ_1² = 4.49                  σ_2² = 1.6475

(B) Combination of set 1 and 2

    k_t     n_t
    .95     0
    .85     1
    .85     2
    .45     3
    .45     4
    .45     5
    .45     6
    .20     7
    .20     8
    .10     9

    Σk_t = 4.95
    Σn_t k_t = 14.55

    By formula (17), r_12 = .6269.

(C) Computation from score distributions

    Set 1                        Set 2
    E_1     f                    E_2     f
    0       .55                  0       .05
    1       .00                  1       .10
    2       .00                  2       .00
    3       .00                  3       .65
    4       .35                  4       .00
    5       .10                  5       .20

    N = 1                        N = 1
    ΣE_1 = 1.90                  ΣE_2 = 3.05
    ΣE_1² = 8.10                 ΣE_2² = 10.95
    Mean E_1 = 1.90              Mean E_2 = 3.05
    σ_1² = 4.49                  σ_2² = 1.6475

(D) Combination of set 1 and 2

    E_t     f
    0       .05
    1       .10
    2       .00
    3       .40
    4       .00
    5       .00
    6       .00
    7       .25
    8       .00
    9       .10
    10      .10

    N = 1
    ΣE_t = 4.95
    ΣE_t² = 34.05

    By (14),

    ΣE_1 E_2 = ½(34.05 - 8.10 - 10.95) = 7.50.

    By (16),

    r_12 = [7.50 - (1.90)(3.05)] / √{[8.10 - (1.90)²][10.95 - (3.05)²]} = .6269.
the values $\sum k_1$, $\sum k_2$, $\sum n_1 k_1$, and $\sum n_2 k_2$ are found. In order to find
$\sum n_t k_t$, the items are completely rearranged in order of difficulty, as
shown at (B), and the difficulty values are multiplied by the new
values of $n_t$. The correlation coefficient is obtained by substitution in
formula (17). At (C), the score distributions for the two sets are
presented. The frequencies are expressed in proportions, and can be
found as follows: the frequency of a failure score of zero is the
proportion passing the hardest item; the frequency of a failure score of
$n$ is the proportion failing the easiest item; and the frequencies of
the intermediate failure scores are the differences between the
proportions failing adjacent items when the items are arranged in order
of difficulty. At (D), the score distribution for the total failure score
on set 1 and 2 is given, and the correlation coefficient is found by
conventional methods after the term $\sum E_1 E_2$ has been found by (14).
Table 2 shows the correlation surface for the two sets of items in 


TABLE 2

Scatter-Diagram of Failure Scores on the Hypothetical Item Sets 1 and 2 Treated
in Table 1. Cell values are proportions of the total population (N = 1)

[Table body not legible in the source. Axes: Failure Score, Set 1; Failure Score, Set 2.]

r = .6269


terms of proportions obtaining given score-combinations. The prod- 
uct-moment correlation coefficient obtained here is again .6269. 
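The scatter-diagram and the conventional computation can be reproduced from the item difficulties under the stated assumptions. The sketch below is one possible way to do so, not the author's procedure: it assumes that a person fails exactly those items whose $k$ exceeds that person's position in the ability ordering, so that the population splits into bands between adjacent difficulty values. The names `joint_distribution` and `pearson_r` are hypothetical.

```python
import math

def joint_distribution(k1, k2):
    """Proportion of the population at each (E1, E2) failure-score pair
    when every item measures the same ability without error or guessing."""
    cuts = sorted(set(k1) | set(k2) | {0.0, 1.0})
    cells = {}
    for lo, hi in zip(cuts, cuts[1:]):
        # People in the band (lo, hi) fail every item with k >= hi.
        e1 = sum(1 for k in k1 if k >= hi)
        e2 = sum(1 for k in k2 if k >= hi)
        cells[(e1, e2)] = cells.get((e1, e2), 0.0) + (hi - lo)
    return cells

def pearson_r(cells):
    """Product-moment correlation computed from cell proportions (N = 1)."""
    m1 = sum(p * e1 for (e1, _), p in cells.items())
    m2 = sum(p * e2 for (_, e2), p in cells.items())
    s11 = sum(p * e1 * e1 for (e1, _), p in cells.items())
    s22 = sum(p * e2 * e2 for (_, e2), p in cells.items())
    s12 = sum(p * e1 * e2 for (e1, e2), p in cells.items())
    return (s12 - m1 * m2) / math.sqrt((s11 - m1 ** 2) * (s22 - m2 ** 2))

set1 = [0.45, 0.45, 0.45, 0.45, 0.10]
set2 = [0.95, 0.85, 0.85, 0.20, 0.20]
cells = joint_distribution(set1, set2)
for pair in sorted(cells):
    print(pair, round(cells[pair], 2))
print(round(pearson_r(cells), 4))
```

Only six score-combinations receive any frequency, which is why the coefficient cannot reach unity with these marginal distributions.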

We have now analyzed the theoretical limiting case where all 
items measure a single factor. Relationships similar to those demon- 
strated here also hold in the case where we have two tests whose un- 
derlying continua are not perfectly correlated. The true correlation 
between two such tests can be expected to be obscured to some extent 
by differences in the difficulty levels of the two tests. Space does not 
permit giving a complete demonstration of this fact, but it may be
noted that given the side entries (frequencies) for any correlation 
table, the maximum positive product-moment correlation coefficient 
may be determined by considering that these entries fulfil the condi- 
tions of formula (17) and evaluating the formula. Thus, for example, 
given the side entries of Table 2, it is impossible to write a correlation 
table which will yield a correlation higher than .6269. 


Statistics of a Set of Items When the Probability 
of Chance Success (c) Is Greater Than Zero 


The formulations given thus far can be developed in such a way 
as to be applicable to the case where the probability of chance success 
is greater than zero. For example, the items may be cast in such a 
form that the individual may choose between several alternative responses.
The probability of chance success ($c$) may if desired be determined on
a priori grounds as the ratio of the number of correct alternatives to the
total number of alternatives, but the method of determining $c$ is irrelevant
to the formulations given here. We shall set $d = 1 - c$, for the sake of
simplicity in certain expressions.

We shall first find the relations between (1) distributions of 
“true” failure scores unaffected by chance success and (2) distribu- 
tions when the failure scores are affected by chance success. In the 
following, we do not need to assume that the items measure a single 
ability. However, it will be assumed that every subject who does not
truly know an item will guess, i.e., choose one of the alternatives.

Let us suppose we have a set of 5 items which are heterogeneous
in difficulty and which are subject to passing by chance success. Let $c$
be uniform for all items. Let $f_E$ = the frequency of a true failure score
$E$ unaffected by chance success. In Table 3, the actual failure scores of


TABLE 3

Frequency Distributions of Actual Failure Scores ($E_c$)
Made by Those Obtaining Each True Failure Score ($E$)

                          True Failure Score (E)
 E_c     0       1        2           3            4             5             Σ
 0       f_0     c f_1    c² f_2      c³ f_3       c⁴ f_4        c⁵ f_5
 1               d f_1    2cd f_2     3c²d f_3     4c³d f_4      5c⁴d f_5
 2                        d² f_2      3cd² f_3     6c²d² f_4     10c³d² f_5
 3                                    d³ f_3       4cd³ f_4      10c²d³ f_5
 4                                                 d⁴ f_4        5cd⁴ f_5
 5                                                               d⁵ f_5
 Σ       f_0     f_1      f_2         f_3          f_4           f_5           N
those subjects with each true failure score are classified on the basis of
the binomial theorem. For example, as shown in the column headed 0,
those who make a true failure score of 0 do not guess and therefore
make actual failure scores of 0. Those who make a true failure score
of 1 have a chance $c$ of passing the one remaining failed item to
make an actual failure score of 0. The remainder, $d f_1$, make an
actual failure score of 1. The actual failure scores of those making a
true failure score of 5 are distributed as shown in the column headed 5.

In order to find the mean ($\bar{E}_c$) and the standard deviation ($\sigma_c$)
of the distribution of actual failure scores, we find the values of the
sums $\sum E_c$ and $\sum E_c^2$. We can find these values for each column of
Table 3 and sum over all the columns.

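For any column of Table 3,

$$\begin{aligned}
\sum E_c &= f_E\left[0 \cdot c^{E} + 1 \cdot E\,c^{E-1}d + 2 \cdot \frac{E(E-1)}{1 \cdot 2}\,c^{E-2}d^{2} + \cdots + E\,d^{E}\right] \\
&= d f_E E\left[c^{E-1} + (E-1)\,c^{E-2}d + \frac{(E-1)(E-2)}{1 \cdot 2}\,c^{E-3}d^{2} + \cdots + d^{E-1}\right] \\
&= d f_E E.
\end{aligned} \tag{20}$$

Summing over all columns, in the general case,

$$\sum E_c = d\sum f_E E = d\sum E. \tag{21}$$

To find $\sum E_c^2$, we note that for any column of Table 3,

$$\begin{aligned}
\sum E_c^2 &= f_E\left[0 \cdot c^{E} + 1^2 \cdot E\,c^{E-1}d + 2^2 \cdot \frac{E(E-1)}{1 \cdot 2}\,c^{E-2}d^{2} + \cdots + E^2 d^{E}\right] \\
&= d f_E E\left[c^{E-1} + 2(E-1)\,c^{E-2}d + 3 \cdot \frac{(E-1)(E-2)}{1 \cdot 2}\,c^{E-3}d^{2} + \cdots + E\,d^{E-1}\right],
\end{aligned}$$

or,

$$\begin{aligned}
\sum E_c^2 = d f_E E\Big\{ & 0 \cdot c^{E-1} + \left[c^{E-1}\right] + (E-1)\,c^{E-2}d + \left[(E-1)\,c^{E-2}d\right] \\
& + 2 \cdot \frac{(E-1)(E-2)}{1 \cdot 2}\,c^{E-3}d^{2} + \left[\frac{(E-1)(E-2)}{1 \cdot 2}\,c^{E-3}d^{2}\right] \\
& + \cdots + (E-1)\,d^{E-1} + \left[d^{E-1}\right]\Big\}.
\end{aligned} \tag{22}$$

In (22) the sum of the expressions within the braces but not in the
square brackets can be found by a method similar to that employed
in simplifying equation (20). The expressions within the square
brackets represent the expansion of $(c + d)^{E-1} = 1$. Hence, for a
given column of Table 3,

$$\begin{aligned}
\sum E_c^2 &= d f_E E\left[d(E-1) + 1\right] \\
&= cd f_E E + d^2 f_E E^2.
\end{aligned} \tag{23}$$

Summing over all columns, we have

$$\begin{aligned}
\sum E_c^2 &= cd\sum f_E E + d^2\sum f_E E^2 \\
&= cd\sum E + d^2\sum E^2.
\end{aligned} \tag{24}$$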

Formulas for the mean and standard deviation of failure scores
affected by chance can now be written in terms of the statistics of the
distribution of “true” scores unaffected by chance, as follows:

$$\bar{E}_c = d\bar{E}; \tag{25}$$

$$\sigma_c = \sqrt{d^2\sigma_E^2 + cd\bar{E}}. \tag{26}$$
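The binomial reasoning behind formulas (20) and (23), and the resulting formulas (25) and (26), can be checked with a few lines of code. This is a hedged sketch rather than anything given in the paper: the function names are hypothetical, and $f_E$ is taken as 1 so that a column of Table 3 becomes a probability distribution.

```python
from math import comb

def actual_score_distribution(E, c):
    """One column of Table 3 with f_E = 1: probability of each actual failure
    score e = 0..E for a subject whose true failure score is E."""
    d = 1 - c
    return [comb(E, e) * c ** (E - e) * d ** e for e in range(E + 1)]

def chance_affected_stats(mean_true, var_true, c):
    """Mean and variance of chance-affected failure scores, formulas (25) and (26)."""
    d = 1 - c
    return d * mean_true, d ** 2 * var_true + c * d * mean_true

E, c = 5, 0.25
column = actual_score_distribution(E, c)
first_moment = sum(e * p for e, p in enumerate(column))       # should equal d*E, as in (20)
second_moment = sum(e * e * p for e, p in enumerate(column))  # should equal c*d*E + d^2*E^2, as in (23)
print(first_moment, second_moment)
print(chance_affected_stats(1.90, 4.49, c))
```

The last line applies (25) and (26) to the mean and variance of set 1 in Table 1, with $c = .25$ chosen only for illustration.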


The Correlation between Actual Failure Scores on Two Sets of Items
Affected by Chance Success When the Distributions of True 
or Non-Chance Scores Are Known 


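By (21) and (24) we can show the effect of chance success on
the combined scores for set 1 and set 2. Let $E_{c_t} = E_{c_1} + E_{c_2}$. By
(21),

$$\sum E_{c_t} = d\sum E_t, \tag{27}$$

and by (24),

$$\sum E_{c_t}^2 = cd\sum E_t + d^2\sum E_t^2. \tag{28}$$

By analogy with (14),

$$\sum E_{c_1} E_{c_2} = \tfrac{1}{2}\left(\sum E_{c_t}^2 - \sum E_{c_1}^2 - \sum E_{c_2}^2\right).$$

Substituting (24), expanding, and simplifying, we obtain

$$\sum E_{c_1} E_{c_2} = d^2\sum E_1 E_2. \tag{29}$$

The expected correlation between $E_{c_1}$ and $E_{c_2}$ can be found by
substituting the appropriate values in a formula for the Pearsonian
product-moment correlation coefficient, as follows:

$$\begin{aligned}
r_{c_1 c_2} &= \frac{\dfrac{\sum E_{c_1} E_{c_2}}{N} - \bar{E}_{c_1}\bar{E}_{c_2}}{\sigma_{c_1}\sigma_{c_2}} \\
&= \frac{\dfrac{d^2\sum E_1 E_2}{N} - d^2\bar{E}_1\bar{E}_2}{\sqrt{\left[d^2\sigma_1^2 + cd\bar{E}_1\right]\left[d^2\sigma_2^2 + cd\bar{E}_2\right]}} \\
&= \frac{d\,r_{12}\,\sigma_1\sigma_2}{\sqrt{d^2\sigma_1^2\sigma_2^2 + cd\bar{E}_1\sigma_2^2 + cd\bar{E}_2\sigma_1^2 + c^2\bar{E}_1\bar{E}_2}}.
\end{aligned} \tag{30}$$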

Formula (30) holds whether or not the items measure a single ability. 


Correcting Correlations for Chance Success 
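It may be of practical value to have formulas (25), (26), and
(30) expressed in such a way that the “true” values $\bar{E}$, $\sigma_E$, and $r_{12}$
can be estimated directly from the empirical values $\bar{E}_c$, $\sigma_c$, and $r_{c_1 c_2}$.
The required formulas are as follows:

$$\bar{E} = \frac{\bar{E}_c}{d}; \tag{31}$$

$$\sigma_E = \frac{\sqrt{\sigma_c^2 - c\bar{E}_c}}{d}; \tag{32}$$

$$r_{12} = \frac{r_{c_1 c_2}\,\sigma_{c_1}\sigma_{c_2}}{\sqrt{\sigma_{c_1}^2\sigma_{c_2}^2 - c\bar{E}_{c_1}\sigma_{c_2}^2 - c\bar{E}_{c_2}\sigma_{c_1}^2 + c^2\bar{E}_{c_1}\bar{E}_{c_2}}}. \tag{33}$$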
Thus, if we obtain the correlation between two multiple-choice tests 
of ability in which the subjects have responded to every item, we can 
estimate by (33) the correlation which would exist between the tests 
if the items were perfectly reliable and if the items were passed only 
by true mastery. 
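A small round-trip check may make the correction concrete: apply formula (30) to known true statistics, then recover the true correlation with formula (33). The sketch below assumes a single value of $c$ for both tests, as the formulas do; the function names are hypothetical, and the true statistics are borrowed from Table 1 with $c = .25$ chosen purely for illustration.

```python
import math

def chance_affected_r(r12, mean1, sd1, mean2, sd2, c):
    """Formula (30): expected r between chance-affected failure scores."""
    d = 1 - c
    var_c1 = d ** 2 * sd1 ** 2 + c * d * mean1
    var_c2 = d ** 2 * sd2 ** 2 + c * d * mean2
    return d ** 2 * r12 * sd1 * sd2 / math.sqrt(var_c1 * var_c2)

def corrected_r(rc, mean_c1, sd_c1, mean_c2, sd_c2, c):
    """Formula (33): estimate of the true r from chance-affected statistics."""
    v1 = sd_c1 ** 2 - c * mean_c1
    v2 = sd_c2 ** 2 - c * mean_c2
    if v1 <= 0 or v2 <= 0:
        raise ValueError("observed variance too small for the assumed c")
    return rc * sd_c1 * sd_c2 / math.sqrt(v1 * v2)

c, d = 0.25, 0.75
mean1, var1, mean2, var2, r12 = 1.90, 4.49, 3.05, 1.6475, 0.6269
rc = chance_affected_r(r12, mean1, math.sqrt(var1), mean2, math.sqrt(var2), c)
sd_c1 = math.sqrt(d ** 2 * var1 + c * d * mean1)   # formula (26)
sd_c2 = math.sqrt(d ** 2 * var2 + c * d * mean2)
print(round(rc, 4), round(corrected_r(rc, d * mean1, sd_c1, d * mean2, sd_c2, c), 4))
```

With sample data the quantity under the radical in (33) can turn out zero or negative, which is why the sketch raises an error instead of returning a value in that case.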

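Where we have two equivalent forms of a test, i.e., where
$\bar{E}_{c_1} = \bar{E}_{c_2}$ and $\sigma_{c_1} = \sigma_{c_2}$, formula (33) becomes

$$r_{12} = \frac{r_{c_1 c_2}\,\sigma_c^2}{\sigma_c^2 - c\bar{E}_c}. \tag{34}$$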
Effect of Chance Success on Sets of Items Measuring a Single Ability
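We now return to the analysis of items measuring a single ability.
Formulas (25), (26), and (30) can be written directly in terms
of the true difficulties ($k$ values) of the items concerned by
substituting formulas (2), (6), and (17), respectively:

$$\bar{E}_c = d\bar{E} = d\sum k_i; \tag{35}$$

$$\begin{aligned}
\sigma_c^2 &= d^2\sigma_E^2 + cd\bar{E} \\
&= d\sum k_i + 2d^2\sum n_i k_i - d^2\left(\sum k_i\right)^2;
\end{aligned} \tag{36}$$

$$\begin{aligned}
r_{c_1 c_2} &= \frac{d\,r_{12}\,\sigma_1\sigma_2}{\sqrt{d^2\sigma_1^2\sigma_2^2 + cd\bar{E}_1\sigma_2^2 + cd\bar{E}_2\sigma_1^2 + c^2\bar{E}_1\bar{E}_2}} \\
&= \frac{d\left[\sum n_t k_t - \sum n_1 k_1 - \sum n_2 k_2 - \left(\sum k_1\right)\left(\sum k_2\right)\right]}{\sqrt{\left[\sum k_1 + 2d\sum n_1 k_1 - d\left(\sum k_1\right)^2\right]\left[\sum k_2 + 2d\sum n_2 k_2 - d\left(\sum k_2\right)^2\right]}}.
\end{aligned} \tag{37}$$

Special cases of formula (37) can now be written by introducing
modifications of formula (17) as before. When all items of set 2 are
harder than any item in set 1,

$$r_{c_1 c_2} = \frac{d\sum k_1\left(n_2 - \sum k_2\right)}{\sqrt{\left[\sum k_1 + 2d\sum n_1 k_1 - d\left(\sum k_1\right)^2\right]\left[\sum k_2 + 2d\sum n_2 k_2 - d\left(\sum k_2\right)^2\right]}}. \tag{38}$$

When each item in set 1 is matched by an item of corresponding
difficulty in set 2, $r_{12} = 1.00$, $\bar{E}_1 = \bar{E}_2$, and $\sigma_1 = \sigma_2$. Hence,

$$r_{c_1 c_2} = \frac{d\left[\sum k_1 + 2\sum n_1 k_1 - \left(\sum k_1\right)^2\right]}{\sum k_1 + 2d\sum n_1 k_1 - d\left(\sum k_1\right)^2}. \tag{39}$$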
When each set consists of items uniform in difficulty, we have for any
one of the sets, by (25) and (7),

$$\bar{E}_c = d\bar{E} = dnk; \tag{40}$$

and by (26) and (8),

$$\begin{aligned}
\sigma_c^2 &= d^2\sigma_E^2 + cd\bar{E} \\
&= d^2 n^2 k(1-k) + cdnk.
\end{aligned} \tag{41}$$

By (29), and in view of the fact that the items are uniform in
difficulty,

$$\sum E_{c_1} E_{c_2} = d^2\sum E_1 E_2 = d^2 n_2\sum k_1 = d^2 n_1 n_2 k_1, \tag{42}$$

where $N = 1$ and the subscript 2 refers to the harder set of items.
Hence,

$$\begin{aligned}
r_{c_1 c_2} &= \frac{d^2 n_1 n_2 k_1 - (d n_1 k_1)(d n_2 k_2)}{\sqrt{\left[d^2 n_1^2 k_1(1-k_1) + cd n_1 k_1\right]\left[d^2 n_2^2 k_2(1-k_2) + cd n_2 k_2\right]}} \\
&= \frac{d\,n_1 n_2 k_1(1-k_2)}{\sqrt{\left[n_1 k_1(c + d n_1 - d n_1 k_1)\right]\left[n_2 k_2(c + d n_2 - d n_2 k_2)\right]}}.
\end{aligned} \tag{43}$$

Where $n_1 = n_2 = n$, formula (43) becomes

$$r_{c_1 c_2} = \frac{d n k_1(1-k_2)}{\sqrt{k_1 k_2(c + dn - dnk_1)(c + dn - dnk_2)}}. \tag{44}$$

Further, when $k_1 = k_2$, that is, where the tests are of exactly the
same difficulty and may be regarded as “equivalent” forms,

$$r_{c_1 c_2} = \frac{dn(1-k)}{c + dn(1-k)}. \tag{45}$$
It can be shown that the correlation estimated by (45) varies in
accordance with the Spearman-Brown formula for lengthened
test-reliability. Let $\nu$ be the number of times a test of $n$ items is increased
in length; let $r_{nn}$ be the reliability of a test of length $n$. By the
Spearman-Brown formula,

$$r_{\nu\nu} = \frac{\nu\,r_{nn}}{1 + (\nu - 1)\,r_{nn}}.$$

Substituting formula (45) for $r_{nn}$, we obtain

$$r_{\nu\nu} = \frac{\nu\,\dfrac{dn(1-k)}{c + dn(1-k)}}{1 + (\nu - 1)\,\dfrac{dn(1-k)}{c + dn(1-k)}} = \frac{\nu\,dn(1-k)}{c + \nu\,dn(1-k)}. \tag{46}$$

If we consider $n$ as a variable quantity, $\nu n$ is merely a particular
value of $n$; hence (46) is essentially equivalent to formula (45), and
it is shown that the reliability of a test as estimated by formula (45)
varies as a function of the length of the test according to the
Spearman-Brown prophecy formula.
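The equivalence of (45) and (46) can be confirmed numerically: lengthening a test $\nu$ times by the Spearman-Brown formula should give the same value as evaluating (45) directly for $\nu n$ items. The following sketch uses hypothetical function names and arbitrary illustrative values of $n$, $k$, $c$, and $\nu$.

```python
def equivalent_forms_r(n, k, c):
    """Formula (45): r between equivalent n-item forms of uniform difficulty k."""
    d = 1 - c
    return d * n * (1 - k) / (c + d * n * (1 - k))

def spearman_brown(r, nu):
    """Reliability of a test lengthened nu times."""
    return nu * r / (1 + (nu - 1) * r)

n, k, c, nu = 5, 0.4, 0.25, 3
lengthened = spearman_brown(equivalent_forms_r(n, k, c), nu)
direct = equivalent_forms_r(nu * n, k, c)
print(round(lengthened, 4), round(direct, 4))
```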
When each set consists of one item, formula (44) becomes

$$r_{c_1 c_2} = \frac{d\,k_1(1-k_2)}{\sqrt{k_1 k_2(1 - dk_1)(1 - dk_2)}}. \tag{47}$$








Application of the Tetrachoric Correlation Coefficient 


Thus far all formulations presented here have been in terms of 
the Pearsonian product-moment correlation coefficient. There is con- 
siderable question as to whether this statistic is applicable in all cases 
which have been treated. Its use in evaluating the relation between 
two items appears to be inappropriate in view of the broad categories 
involved. The Pearsonian coefficient affords a means of estimating 
the efficiency of prediction of the score on one item from the score on 
another item. If the primary concern is not with prediction, however, 
but with the factorial relation between two items, the Pearsonian co- 
efficient does not give a correct indication of this relation because 
even where sets of items measure a single ability, the correlation co- 
efficient varies widely as a function of the difficulties of the items. 


FIGURE 1
Abac Showing the Expected Tetrachoric Correlations Between Items Measuring
a Single Ability, When $c_i = c_j = .50$. Values above the diagonal line are
meaningless.
Numerical evaluation of the various formulas presented in this paper
will demonstrate that (a) the obtained correlation coefficient of tests 
or items decreases as the tests or items become less similar in diffi- 
culty; and (b) other things being equal, the obtained correlation of 
pairs of items decreases as their average difficulty becomes greater. 

The tetrachoric correlation coefficient may be applicable to some
of the cases which have been treated. Given the frequencies in a
2 × 2 table, we can estimate by means of $r_{tet}$ the degree of correlation
represented by the best normal correlation surface fitted to
these frequencies. $r_{tet}$ is thus an estimate of the correlation between
the continua underlying two items or two sets of items. When chance
success is not operating, and when the items measure the same ability,
$r_{tet}$ will always be unity, because however the variables are
dichotomized, one cell in the 2 × 2 table will be vacant. $r_{tet}$ may profitably
be used as an indication of the extent to which a pair of items fall on
a homogeneous continuum of difficulty and overlap factorially.
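The paper reads tetrachoric values from computing tables; as a rough modern substitute, the coefficient can be found numerically. The sketch below is an assumption-laden stand-in, not the author's procedure: it uses the identity $\Phi_2(h, k; r) = \Phi(h)\Phi(k) + \int_0^r \phi_2(h, k; t)\,dt$ for the bivariate normal, evaluates the integral by Simpson's rule, and solves for $r$ by bisection. The function names `bvn_lower` and `tetrachoric` are hypothetical.

```python
import math
from statistics import NormalDist

def bvn_lower(h, k, r, steps=200):
    """P(X <= h, Y <= k) for a standard bivariate normal with correlation r,
    computed as Phi(h)*Phi(k) plus the integral of the density over 0..r."""
    def density(t):
        s = 1.0 - t * t
        return math.exp(-(h * h - 2 * t * h * k + k * k) / (2 * s)) / (2 * math.pi * math.sqrt(s))
    step = r / steps
    total = density(0.0) + density(r)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * step)   # Simpson weights
    nd = NormalDist()
    return nd.cdf(h) * nd.cdf(k) + total * step / 3

def tetrachoric(fail_both, fail_1, fail_2, tol=1e-6):
    """Correlation of the fitted normal surface for a 2 x 2 table given the
    proportions failing both items, item 1, and item 2."""
    nd = NormalDist()
    h, k = nd.inv_cdf(fail_1), nd.inv_cdf(fail_2)
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bvn_lower(h, k, mid) < fail_both:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Median dichotomies: a "both fail" cell of 1/3 corresponds to r = .5.
print(round(tetrachoric(1 / 3, 0.5, 0.5), 3))
```

When one cell of the table is vacant, as in the chance-free single-ability case, the search simply runs to its upper bound of .999, which corresponds to the limiting value of unity described above. Accuracy near that bound is limited, so the sketch is best treated as adequate for moderate correlations only.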

Where chance success is a factor (i.e., where $c > .00$), the tetrachoric
correlation tends to vary in the same manner as the Pearsonian
correlation; i.e., the obtained correlation coefficient tends to
decrease as the tests become less similar in difficulty. Figure 1 is an
abac showing for illustrative purposes the inter-item tetrachoric
correlations expected when the items are assumed to measure a single
ability and when $c = .5$ for both items.

When the measurements correlated consist of more than one item 
and are uniform in difficulty, the value of the tetrachoric correlation 
is markedly affected by the point of dichotomization in each variable. 
For purposes of illustration, Table 4 shows the correlation surface 
that would result if we had two sets of items all measuring a single 
factor, where each set is of uniform difficulty and where each item is


TABLE 5

Tetrachoric Correlations Obtained by Various
Dichotomizations of Distributions in Table 4

 Set 2:                      Set 1: Dichotomization between:
 Dichotomization between:    5-4     4-3     3-2     2-1     1-0
 1-0                         *       .50     .60     .68     .75
 2-1                         *       .25     .33     .40     [illegible]
 3-2                         *       .18     .18     .28     .27
 4-3                         *       .07     .11     .15     .18
 5-4                         *       *       *       *       *

* Values cannot be determined from Thurstone tables on account of small side-entry values.



















affected by chance success. Test 1 is assumed to be composed of items 
with difficulties (k) of .4; for test 2, k = .8. These difficulties are in 
terms of the proportions who fail the items when there is no possi- 
bility of chance success. It is assumed that c = .5 throughout. The 
frequencies, expressed as proportions, have been computed by (a) 
writing the cell-frequencies in the non-chance situation, and (b) dis- 
tributing by a priori probability the chance-affected failure scores of 
those who are item failers in the non-chance situation. The Pear- 
sonian* correlation for the table is .25 by formula (44); this value 
can also be found directly from the table by conventional procedures. 
Table 5 shows the tetrachoric correlations which are obtained by taking
all the possible pairs of dichotomization points. Dichotomization
as near the medians as possible (between scores of 0 and 1 for test 1
and between scores of 2 and 3 for test 2) yields $r_{tet} = .27$. The highest
tetrachoric obtainable, .75, is for dichotomization between scores
of 0 and 1 for both tests.

If it is desired to use the tetrachoric correlation to estimate the 
factorial relation between items where chance success can operate, it 
is necessary to correct the proportions in the 2 X 2 table for the effect 
of chance success. In order to minimize random error, the correction 
should not be attempted unless a large number of cases (say, 200 or 
more) are available and unless the value of $c$ can be estimated with
considerable confidence. This correction is made as follows:

(1) Correct the side entries for chance success. Let $p_i$ = the
proportion of individuals actually passing an item, whether by true
mastery or by chance, and let $q_i = 1 - p_i$ be the proportion actually
failing it. $k_i$ is the estimated proportion who would fail
the item when the factor of chance success is not operating. Then

$$p_i = (1 - k_i) + c_i k_i = 1 - d_i k_i; \tag{48}$$

$$k_i = \frac{1 - p_i}{d_i} = \frac{q_i}{d_i}. \tag{49}$$

(2) Estimate the corrected value of the proportions inside the
2 × 2 table. Let $q_{ij}$ = the proportion actually failing both items; let
$k_{ij}$ = the estimated proportion failing both items when guessing is
not possible. Then by a priori probability,

$$q_{ij} = d_i d_j k_{ij}; \tag{50}$$

$$k_{ij} = \frac{q_{ij}}{d_i d_j}. \tag{51}$$


(3) Fill out the remainder of the cells by subtraction. 
For example, suppose a four-fold table (Table 6) has been obtained
for two items where $c_1 = c_2 = .25$. The tetrachoric correlation
determined from Table 6 is .29. In order to correct this table for
chance, we construct Table 7. By (49), the estimated $k$ values of
items 1 and 2 are .467 and .600, respectively. By (51), the estimated
true proportion of persons failing both items is $.20/(.75)^2 = .356$.
The tetrachoric correlation estimated from Table 7 is now .48.
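The three correction steps can be sketched in a few lines. The function below is illustrative only (its name and the dictionary keys are hypothetical); it applies (49) and (51) and then fills the remaining cells by subtraction, using the proportions of the worked example.

```python
def correct_fourfold(q_both, q1, q2, c1, c2):
    """Correct a 2 x 2 table for chance success.

    q_both, q1, q2: observed proportions failing both items, item 1, item 2.
    c1, c2: assumed probabilities of chance success on items 1 and 2.
    """
    d1, d2 = 1 - c1, 1 - c2
    k1, k2 = q1 / d1, q2 / d2          # formula (49)
    k12 = q_both / (d1 * d2)           # formula (51)
    return {
        "k1": k1,
        "k2": k2,
        "fail_both": k12,
        "fail_1_pass_2": k1 - k12,     # remaining cells by subtraction
        "pass_1_fail_2": k2 - k12,
        "pass_both": 1 - k1 - k2 + k12,
    }

corrected = correct_fourfold(0.20, 0.35, 0.45, 0.25, 0.25)
print({name: round(value, 3) for name, value in corrected.items()})
```

A negative entry among the subtracted cells signals the situation in which $k_{ij}$ exceeds $k_i$ or $k_j$; the sketch reports it as is rather than adjusting it.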


TABLE 6
Proportions of Individuals Passing and Failing Two Items
Where Chance Success is Possible

                              Item 1
                     Fail          Pass      Total
 Item 2    Pass      .15           .40       .55
           Fail      .20 = q_12    .25       .45 = q_2
           Total     .35 = q_1     .65       1.00

 r_tet = .29
TABLE 7

Proportions of Individuals Passing and Failing Two Items
by Actual Mastery Alone, Estimated from Table 6

                              Item 1
                     Fail      Pass      Total
 Item 2    Pass      .111      .289      .400
           Fail      .356      .244      .600
           Total     .467      .533      1.000

 r_tet = .48


It will sometimes happen that $k_{ij}$ as estimated by (51) is greater
than one or both of the values $k_i$ or $k_j$. This result may represent a
random deviation from theoretical probability or it may be due to an
overestimation of $c_i$ or $c_j$. In practice, it will probably be found best
to infer from this result a correlation approaching unity unless there
is external evidence that either $c_i$ or $c_j$ has been grossly overestimated.

In passing it may be noted that formulas (49) and (51) are
special cases of (21) and (29), respectively.
Implications


It has been shown that Pearsonian correlations between items or 
sets of items measuring abilities tend to vary as a function of (a) the 
difficulties of the items involved, and (b) the extent to which chance 
success by guessing is possible. Several means of estimating the true 
factorial overlap between items are proposed, viz., the use of the tetrachoric
correlation coefficient and the correction of 2 × 2 tables for
chance by a priori probability theory. These techniques can be applied 
only under rather severely circumscribed conditions; for example, 
it is necessary that all subjects shall have made some response to 
each item involved. Nevertheless, it may be found profitable to set 
up rigorously controlled testing conditions in order to take advantage 
of the formulations presented in this paper. 

Correlations between items or small sets of items uniform in dif- 
ficulty which have appeared in the literature must be carefully scru- 
tinized for the possibility that they may have been subject to spuri- 
ous influences such as those with which we have dealt here. Factorial 
studies of items must be examined for the possibility that hetero- 
geneity of items in difficulty has given rise to spurious factors. 

Techniques for studying the factorial composition of items are 
not yet adequate. It will quite probably be found that the factor analy- 
sis of items cannot be based upon any correlation measure which can 
be derived from a pair of items alone, since all that can be established 
from a pair of items alone is the extent to which they measure on a 
“polar” scale of difficulty such that all who pass the harder item also 
pass the easier and all who fail the easier item also fail the harder. 
INTERPRETATION OF SECOND-ORDER FACTORS


KARL J. HOLZINGER 
UNIVERSITY OF CHICAGO 


It is shown that a “second-order” factor pattern is equivalent to
the transformation employed in rotating an orthogonal factor pattern
to an oblique form. The correlation among the second-order factors
may then be interpreted as due to the original first-order factors.


The question is often raised, “What has become of the general 
orthogonal factor when an oblique solution is made?” The answer 
generally is that the general factor is expressed somehow by the inter- 
correlations of the oblique factors. The present paper is concerned 
with giving more precise answers to such questions. 

When correlations are employed in factor analysis, an oblique 
solution can be made only by a transformation of an orthogonal so- 
lution already known. Since we are concerned here only with com- 
mon factors, this transformation will be made in the common-factor 
space (abbreviated hereafter as c.f.s.). The analysis which follows 
will be illustrated at each stage with the seven-variable bi-factor pat- 
tern from Holzinger’s Manual. This example is chosen to illustrate 
the relationship between the general factor and the intercorrelations 
of the oblique factors, but any other pattern would have served equally 
well for the general problem. 

Let the common-factor portion of an orthogonal factor pattern
be written in the form,

$$Z_j = A_{js} F_s , \tag{1}$$

where the common factors $F_s$ ($s = 1, 2, 3, \ldots, m$) are in standard
form and $A_{js}$ is the matrix (of rank $m$) of their coefficients for the
variables $Z_j$ ($j = 1, 2, 3, \ldots, n$), which are not in standard form.
The variables $Z_j$ are the projections in the c.f.s. of the entire
variables which also include the unique factors. The illustration of
equation (1) may be written as follows:

$$
\begin{aligned}
Z_1 &= .7F_1 + .5F_2 + 0 \\
Z_2 &= .6F_1 + .6F_2 + 0 \\
Z_3 &= .5F_1 + .8F_2 + 0 \\
Z_4 &= .7F_1 + 0 + .6F_3 \\
Z_5 &= .6F_1 + 0 + .6F_3 \\
Z_6 &= .4F_1 + 0 + .8F_3 \\
Z_7 &= .8F_1 + 0 + 0 .
\end{aligned}
\qquad \text{rank} = 3 \tag{1'}
$$


An oblique solution may be obtained from equation (1) by means
of a transformation derived from the composite variables,

$$
\begin{aligned}
V_1 &= C_{11}F_1 + C_{12}F_2 + \cdots + C_{1m}F_m \\
V_2 &= C_{21}F_1 + C_{22}F_2 + \cdots + C_{2m}F_m \\
&\;\;\vdots \\
V_m &= C_{m1}F_1 + C_{m2}F_2 + \cdots + C_{mm}F_m .
\end{aligned}
\tag{2}
$$

These composite variables are formed by adding the equations within
the m subgroups member by member. For the illustrative example,
therefore,

$$
\begin{aligned}
V_1 &= 1.8F_1 + 1.9F_2 + 0 \\
V_2 &= 1.7F_1 + 0 + 2.0F_3 \\
V_3 &= .8F_1 + 0 + 0 .
\end{aligned}
\tag{2'}
$$


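The standard deviations of the composite variables (in the c.f.s.) are
given by the formula

$$\sigma_s = \sqrt{C_{s1}^2 + C_{s2}^2 + \cdots + C_{sm}^2} , \tag{3}$$

which involves only the C's for uncorrelated factors $F_s$. For the
illustrative example,

$$
\begin{aligned}
\sigma_1^2 &= (1.8)^2 + (1.9)^2 = 6.85; &\quad \sigma_1 &= 2.617; &\quad \frac{1}{\sigma_1} &= .382 \\
\sigma_2^2 &= (1.7)^2 + (2.0)^2 = 6.89; &\quad \sigma_2 &= 2.625; &\quad \frac{1}{\sigma_2} &= .381 \\
\sigma_3^2 &= (.8)^2; &\quad \sigma_3 &= .800; &\quad \frac{1}{\sigma_3} &= 1.250 .
\end{aligned}
\tag{3'}
$$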


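The composite variables (2) are next changed to standard form
by dividing $V_s$ by $\sigma_s$ to give the equations,

$$L_s = \frac{V_s}{\sigma_s} = \frac{C_{s1}}{\sigma_s}F_1 + \frac{C_{s2}}{\sigma_s}F_2 + \cdots + \frac{C_{sm}}{\sigma_s}F_m . \tag{4}$$

For the illustration these equations become

$$
\begin{aligned}
L_1 &= \frac{1.8}{2.617}F_1 + \frac{1.9}{2.617}F_2 + 0 = .688F_1 + .726F_2 + 0 , \\
L_2 &= \frac{1.7}{2.625}F_1 + 0 + \frac{2.0}{2.625}F_3 = .648F_1 + 0 + .762F_3 , \\
L_3 &= \frac{.8}{.8}F_1 + 0 + 0 = 1.000F_1 + 0 + 0 .
\end{aligned}
\tag{4'}
$$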


The standardized variables $L_s$ are defined here as the oblique factors;
that is, they are the composite variables $V_s$ standardized in the c.f.s.

The coefficients in equation (4) form a transformation matrix
employed when an oblique solution is desired.* The rows of this matrix
are the direction cosines of the oblique factors $L_s$ with respect
to the original orthogonal factors $F_s$.

Equation (4) is also a second-order factor pattern, the coefficients
of the factors being given by the columns. The correlation
amongst the factors $L_s$ may therefore be interpreted as due to the
original orthogonal factors $F_s$, as illustrated by equations (4)'. The
general factor in equations (1)' is the same as the general factor in
equations (4)', but the group factors in (1)' are represented in equations
(4)' as unique factors in the c.f.s. This type of uniqueness will occur
only when $A_{js}$ is of bi-factor form.
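
The statement that the correlations among the oblique factors are due to
the original orthogonal factors can be checked numerically. What follows
is a minimal sketch in Python, assuming only the coefficients of
equations (4)'; the names `T`, `phi`, and `general_only` are illustrative
and do not appear in the paper.

```python
# Rows: oblique factors L1, L2, L3; columns: orthogonal factors F1, F2, F3.
# Coefficients copied from equations (4)'.
T = [
    [0.688, 0.726, 0.000],
    [0.648, 0.000, 0.762],
    [1.000, 0.000, 0.000],
]


def dot(u, v):
    """Inner product of two coefficient rows."""
    return sum(a * b for a, b in zip(u, v))


# F1, F2, F3 are uncorrelated and in standard form, so the correlation
# between two oblique factors is the inner product of their rows.
phi = [[dot(T[s], T[t]) for t in range(3)] for s in range(3)]

# Part of each correlation contributed by the general factor F1 alone.
general_only = [[T[s][0] * T[t][0] for t in range(3)] for s in range(3)]

for s in range(3):
    print([round(x, 3) for x in phi[s]])
```

In this bi-factor example every off-diagonal entry of `phi` coincides with
the corresponding entry of `general_only`, because each group factor
enters only one row of the transformation.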

In the above analysis it was assumed that the number of composite
variables is equal to the rank of the matrix (1). This assumption
is necessary in order to obtain exact relationships between factors
in the same space. In case only $m - 1$ composites are employed,
equations of the form (4) may be obtained, but the relationships 
shown will be only approximate because different spaces are involved. 

If a bi-factor pattern with group factors for all variables is em- 
ployed, the number of composite variables will usually be one less 
than the number of common factors. It has sometimes been argued 
that the rank determined by the composites is the correct one and 
that the extra factor in the bi-factor pattern is therefore unwarrant- 
ed. This reasoning is very circular, because we never know the exact 
rank with actual data involving several factors. It might be argued
equally well that the rank of the oblique solution is lower than the
data warrant.

* The oblique solution consists of the structure $S_{jt}$, which gives the
correlations between tests and factors, and the pattern $B_{js}$, which
represents the coefficients in the linear expressions between tests and
oblique factors. The structure is obtained from the above transformation
$T_{st}$ and the pattern $A_{js}$ by the equation $A_{js}T_{st} = S_{jt}$.
The pattern $B_{js}$ is then obtained from the equation
$B_{js} = S_{jt}\phi_{ts}^{-1}$, where $\phi_{st}$ is the matrix of the
intercorrelations of the factors $L_s$.

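A more general treatment of the above problem will next be given.
After the composite variables have been chosen and the $\sigma_s$
computed, the above transformation matrix, coefficients of equation (4),
may be expressed in the form $P_{sj}A_{js}$, where

$$
P_{sj} =
\begin{pmatrix}
\frac{1}{\sigma_1} & \frac{1}{\sigma_1} & \cdots & 0 & 0 & \cdots & 0 \\
0 & 0 & \cdots & \frac{1}{\sigma_2} & \frac{1}{\sigma_2} & \cdots & 0 \\
\vdots & & & & & & \vdots \\
0 & 0 & \cdots & 0 & 0 & \cdots & \frac{1}{\sigma_m}
\end{pmatrix} .
\tag{5}
$$

Equation (4) may then be written in the form,

$$L_s = [P_{sj}A_{js}]F_s , \quad \text{or} \quad L = (PA)F . \tag{6}$$

For the illustrative example the product $P_{sj}A_{js}$ is

$$
\begin{pmatrix}
.382 & .382 & .382 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & .381 & .381 & .381 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1.250
\end{pmatrix}
\times
\begin{pmatrix}
.7 & .5 & 0 \\
.6 & .6 & 0 \\
.5 & .8 & 0 \\
.7 & 0 & .6 \\
.6 & 0 & .6 \\
.4 & 0 & .8 \\
.8 & 0 & 0
\end{pmatrix}
=
\begin{pmatrix}
.688 & .726 & 0 \\
.648 & 0 & .762 \\
1.000 & 0 & 0
\end{pmatrix} ,
$$
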
the matrix on the right being the coefficients in equations (4)'. The 
transformation PA thus sections the matrix (1) into m composite 
variables and standardizes these composites in one step, giving the 
required form of the transformation matrix for future analysis. 
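
The sectioning and standardizing performed by PA can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
the pattern `A` is the seven-variable example of equations (1)', the list
`groups` (a hypothetical name) records which variables are summed into
each composite, and plain lists are used instead of a matrix library.

```python
import math

# Orthogonal bi-factor pattern of equations (1)': rows Z1..Z7, columns F1..F3.
A = [
    [0.7, 0.5, 0.0],
    [0.6, 0.6, 0.0],
    [0.5, 0.8, 0.0],
    [0.7, 0.0, 0.6],
    [0.6, 0.0, 0.6],
    [0.4, 0.0, 0.8],
    [0.8, 0.0, 0.0],
]
# Subgroups of variables (0-based row indices) that form V1, V2, V3.
groups = [[0, 1, 2], [3, 4, 5], [6]]
n_vars, n_factors = len(A), len(A[0])

# Composite coefficients C (equations (2)'): column sums within each subgroup.
C = [[sum(A[j][t] for j in g) for t in range(n_factors)] for g in groups]

# Standard deviations of the composites in the c.f.s. (equation (3)).
sigma = [math.sqrt(sum(c * c for c in row)) for row in C]

# P has 1/sigma_s in the columns of subgroup s and zeros elsewhere (equation (5)).
P = [[1.0 / sigma[s] if j in groups[s] else 0.0 for j in range(n_vars)]
     for s in range(len(groups))]

# Transformation matrix PA (equation (6)).
PA = [[sum(P[s][j] * A[j][t] for j in range(n_vars)) for t in range(n_factors)]
      for s in range(len(groups))]

for row in PA:
    print([round(x, 3) for x in row])
```

Rounded to three places, the rows of `PA` reproduce the coefficients of
equations (4)'.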

It is of some interest to relate the above transformation $P_{sj}A_{js}$
to one more generally employed. Let the oblique pattern for all 7
variables be written in the form,

$$Z_j = B_{js}L_s , \tag{7}$$

or

$$Z = BL . \tag{8}$$


Assuming that patterns (1) and (8) are known, the problem is to
show the relationships between factors. For an oblique pattern B
and an orthogonal pattern A, the problem is to find a matrix T
such that

$$B_{js}T_{st} = A_{jt} ,$$

or more simply,

$$BT = A . \tag{9}$$

Since B does not ordinarily have an inverse we cannot simply write

$$T = B^{-1}A . \tag{10}$$

Noting, however, that

$$(B'B)^{-1}(B'B) = I , \tag{11}$$

and premultiplying both sides of (9) by $(B'B)^{-1}B'$, we find

$$T = (B'B)^{-1}B'A , \tag{12}$$

which is the form generally employed.


It will next be shown that PA of equation (6) is identical with
T. Substituting the right member of (6) in equation (8) gives

$$Z = BPAF . \tag{13}$$

From equations (1) and (13) it is apparent that

$$A = BPA . \tag{14}$$

Substituting the right member of (14) in (12) gives

$$T = (B'B)^{-1}(B'B)PA = IPA = PA . \tag{15}$$

The transformation PA is thus equivalent to T but is obviously much
simpler.

The answer to the question at the beginning of this paper would
appear to be that the correlation amongst factors may be interpreted
as an expression of the original first-order factors. The factorial job 
is done, however, when a solution of the form (1) or (8) and its ac- 
companying structure are obtained. It is therefore not recommended 
that second-order factors be explicitly found in the hope of discover- 
ing something psychologically new. Such factors merely furnish a 
statistical interpretation of the aspects of (1) and (8).


A NOTE ON STEADY STATES AND THE
WEBER-FECHNER LAW


S. SPIEGELMAN 
WASHINGTON UNIVERSITY 


J. M. REINER 
UNIVERSITY OF MINNESOTA 


The steady state of a simple reaction system has been shown to 
have some of the properties of a psychophysical discrimination sys- 
tem, including the possibility of deducing a generalized Weber-Fech- 
ner Law, both in integral form and in difference form. The Weber 
ratio so deduced is not constant, and its dependence on stimulus in- 
tensity is exhibited. The dependence of the difference limen on the 
internal threshold is discussed; it is found that in general there is 
a finite value of this threshold for which response is impossible. 
This critical threshold is lower for higher values of the reference 
stimulus intensity. Similarly, it is shown that the difference limen 
and the Weber ratio, for a fixed value of the threshold, become in- 
finite (i.e., discrimination is impossible) for a value of the stimulus 
intensity which in general is finite. 


In the course of investigating the theoretical properties of the 
steady state in biological systems, the authors found a relation which 
is strongly suggestive of the Weber-Fechner law. This analogy has 
been mentioned by Burton (1), unfortunately without proof or
elaboration; Burton states that “The mathematical proof is complicated.”
The mathematical development is in fact rather simple; and it is the 
purpose of this note to present it, and to indicate some of the conse- 
quences which may be deduced from this relation. 

The theoretical model in question is a physical system in which 
a substance A appears from a source S. A is transformed chemically 
into another substance B , which disappears into a “sink” Z. (The 
source and sink may be conceived of as the environment of the physi- 
cal system, from and into which the substances A and B may diffuse.) 

Let the concentrations of A and B and of the source and sink be
represented by the letter c with appropriate subscripts. Then the
variation of A and B with time is prescribed by the following pair
of differential equations:

$$
\begin{aligned}
\frac{dc_A}{dt} &= k_S c_S - (k_S + k)c_A + k' c_B , \\
\frac{dc_B}{dt} &= k c_A - (k' + k_Z)c_B + k_Z c_Z .
\end{aligned}
\tag{1}
$$

In the above, $k_S$ and $k_Z$ measure the rate of diffusion between the
system and the source or sink, respectively, while $k$ and $k'$ are the
velocity constants (2, p. 952) for the transformation of A into B and
B into A, respectively.

These equations may be solved for $c_A$ and $c_B$ as functions of the
time $t$. It is then easily shown that these functions, with increasing
time, always approach stationary values. However, these values may
be found without solving the differential equations. For, when the
concentrations are stationary and no longer change with time, it
follows that:

$$\frac{dc_A}{dt} = \frac{dc_B}{dt} = 0 . \tag{2}$$

Accordingly, we may fulfill this condition by setting the right-hand
sides of the differential equations equal to zero; this gives a pair of
simultaneous linear equations which may be solved for $c_A$ and $c_B$.
The solutions are:

$$
\begin{aligned}
c_A &= \frac{k_S c_S (k' + k_Z) + k' k_Z c_Z}{k_S (k' + k_Z) + k k_Z} , \\
c_B &= \frac{k_Z c_Z (k_S + k) + k_S c_S k}{k_S (k' + k_Z) + k k_Z} .
\end{aligned}
\tag{3}
$$

Let us now consider a possible interpretation of this simple model.
Suppose that an external stimulus acting upon a biological system
increases the rate at which A is transformed into B; we might then take
k as a measure of the stimulus intensity. Suppose further that the
level of the organism’s response is determined by the concentration
of the product B; we might then take $c_B$ as a measure of the response.
If we suppose that measurements are made under stationary conditions
(i.e., that the time required to attain the stationary state is
short compared with the time of observation), then the second of
equations (3), giving $c_B$ as a function of k, represents the relation
between stimulus and response.

According to Burton (1, p. 333), the Weber-Fechner law states
that “progressive equal increments of intensity of stimulus produce
decreasing increments of response.” In terms of our symbols, this
would mean that the plot of $c_B$ against k has progressively decreasing
slope. In order to prove that equation (3) satisfies this requirement,
it is only necessary to show that the second derivative of $c_B$ with
respect to k is everywhere negative. This follows immediately by
differentiating twice:

$$\frac{d^2 c_B}{dk^2} = -\frac{2 k_S k_Z [k_S c_S (k' + k_Z) + k' k_Z c_Z]}{[k_Z k + k_S (k' + k_Z)]^3} . \tag{4}$$

Since all the quantities in the above expression are intrinsically
positive, it is clear that this expression is everywhere negative.
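
The stationary solution (3) and the decreasing-increment property can be
checked numerically. The sketch below is illustrative only: the parameter
values in `params` are assumptions chosen for convenience (they are not
from the paper), and `kp` is a stand-in name for the constant k'.

```python
def steady_state(k, kS, kZ, kp, cS, cZ):
    """Stationary concentrations (c_A, c_B) from equations (3)."""
    D = kS * (kp + kZ) + k * kZ
    cA = (kS * cS * (kp + kZ) + kp * kZ * cZ) / D
    cB = (kZ * cZ * (kS + k) + kS * cS * k) / D
    return cA, cB


def rates(cA, cB, k, kS, kZ, kp, cS, cZ):
    """Right-hand sides of the differential equations (1)."""
    dcA = kS * cS - (kS + k) * cA + kp * cB
    dcB = k * cA - (kp + kZ) * cB + kZ * cZ
    return dcA, dcB


# Assumed, purely illustrative parameter values.
params = dict(kS=1.0, kZ=0.5, kp=0.2, cS=2.0, cZ=0.1)

# Equal steps of the stimulus k and the corresponding responses c_B.
ks = [0.5 * i for i in range(1, 21)]
responses = [steady_state(k, **params)[1] for k in ks]

# Successive increments of response for equal increments of stimulus.
increments = [b - a for a, b in zip(responses, responses[1:])]

cA, cB = steady_state(1.0, **params)
print("rates at the stationary point:", rates(cA, cB, 1.0, **params))
print("first increments:", [round(x, 4) for x in increments[:4]])
```

With these values the increments are positive and shrink from one step to
the next, which is the behaviour that equation (4) guarantees.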

According to the classical formulation of Fechner, $c_B$ should be
a linear function of log k. However, it has long been known that
data obtained by psychophysical measurements fulfill such a relation
only over a limited middle range of stimulus intensity; when the
intensity scale is extended in both directions, the data usually fit an
ogive or S-shaped curve, which has an approximately linear portion
in the neighborhood of the inflection point near the middle (3, 4, p.
114). It is therefore of some interest to show that equation (3) also
meets this condition. For the sake of simplicity in writing the
formulas, we introduce the following abbreviations:

$$
\begin{aligned}
a &= k_S c_S + k_Z c_Z \\
b &= k_S k_Z c_Z \\
c &= k_Z \\
d &= k_S (k' + k_Z) \\
u &= \log k .
\end{aligned}
\tag{5}
$$

Equation (3) then takes the form

$$c_B = \frac{a e^u + b}{c e^u + d} . \tag{6}$$

The first and second derivatives with respect to u are:

$$\frac{dc_B}{du} = \frac{(ad - bc)e^u}{(c e^u + d)^2} ; \tag{7}$$

$$\frac{d^2 c_B}{du^2} = \frac{(ad - bc)(d - c e^u)e^u}{(c e^u + d)^3} . \tag{8}$$



From these equations we can easily deduce the properties of the
curve. From equation (6) we see that, as u approaches negatively
infinite values, $c_B$ approaches the constant value b/d; as u approaches
positively infinite values, $c_B$ approaches the constant limit a/c.
Moreover, from the definitions (5) it is easily shown that b/d is less
than a/c.

Examining equation (7), we note that $ad - bc$ is a positive quantity
according to (5), since $ad - bc = k_S[k_S c_S(k' + k_Z) + k' k_Z c_Z]$.
The first derivative is therefore always greater than or equal to zero.
Letting u approach positive and negative infinity, we find that the
slope is equal to zero at both limits.

From equation (8) we may find the inflection point by setting 
the expression for the second derivative equal to zero. There is only 
one finite solution, whose value is log (d/c). 

Thus we have a curve in which the response passes from a posi- 
tive plateau upwards through an inflection point to another plateau. 
This is precisely the description of an ogive. 
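
The ogive properties derived from equations (6) to (8) can be illustrated
with a short Python sketch. The parameter values are assumptions made for
the example only, and `kp` again stands for k'.

```python
import math

# Assumed, purely illustrative parameter values.
kS, kZ, kp, cS, cZ = 1.0, 0.5, 0.2, 2.0, 0.1

# Abbreviations (5).
a = kS * cS + kZ * cZ
b = kS * kZ * cZ
c = kZ
d = kS * (kp + kZ)


def response(u):
    """c_B as a function of u = log k, equation (6)."""
    e = math.exp(u)
    return (a * e + b) / (c * e + d)


def slope(u):
    """First derivative of c_B with respect to u, equation (7)."""
    e = math.exp(u)
    return (a * d - b * c) * e / (c * e + d) ** 2


lower, upper = b / d, a / c   # the two plateaus
u_star = math.log(d / c)      # inflection point from equation (8)

print("plateaus:", round(lower, 4), round(upper, 4))
print("inflection at u =", round(u_star, 4))
print("response at inflection:", round(response(u_star), 4))
```

Under these assumptions the response at the inflection point lies midway
between the two plateaus, and the slope is steepest there.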

So far we have, among other things, given a proof for the asser- 
tions offered without proof by Burton. However, a relation between 
stimulus intensity and the corresponding level of response is not
precisely what was contemplated in the original formulation of the
Weber-Fechner law. Many of the relevant experiments are concerned
rather with the relation between increments of stimulus and response, 
and in particular with finite increments rather than differentials. It 
seems of value, therefore, to see what our theoretical model yields in 
such a situation. 

Let the (in general finite) increment of k be denoted by $\delta k$; let
the corresponding increment of $c_B$ be denoted by $\delta c_B$. We obtain
the expression for $\delta c_B$ by using equation (3), with $k + \delta k$
substituted for k, and subtracting from this the original form of
equation (3). We have then

$$\delta c_B = k_S\,\delta k\,\frac{k_S c_S (k' + k_Z) + k' k_Z c_Z}{[k_S (k' + k_Z) + k_Z k]\,[k_S (k' + k_Z) + k_Z k + k_Z\,\delta k]} . \tag{9}$$

This may be compared with the original formulation of Fechner,
which in our notation would give (4)

$$\delta c_B = K\,\frac{\delta k}{k} . \tag{10}$$

In our case, not only is the response increment not simply inversely 
proportional to k , but it depends on the stimulus increment in a more 
complicated way than that given by direct proportionality. In par- 
ticular, our formulation predicts that, for sufficiently large incre- 
ments of stimulus intensity, the increment of response approaches a 
constant limiting value which is independent of the stimulus increment.
In other words, if the jump from one stimulus intensity to a
new one be sufficiently large, the responding organism can tell that
the new stimulus is larger, but no longer can estimate how much larg- 
er it is. 
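
The saturation of the response increment for large stimulus increments
can be seen directly from equation (9). The following is a minimal sketch
with assumed parameter values (not from the paper); `limiting_increment`
is a hypothetical helper giving the limit of (9) as the increment of k
grows without bound.

```python
# Assumed, purely illustrative parameter values; kp stands for k'.
kS, kZ, kp, cS, cZ = 1.0, 0.5, 0.2, 2.0, 0.1


def delta_response(k, dk):
    """Response increment of equation (9) at stimulus level k, increment dk."""
    N = kS * cS * (kp + kZ) + kp * kZ * cZ
    D = kS * (kp + kZ) + kZ * k
    return kS * dk * N / (D * (D + kZ * dk))


def limiting_increment(k):
    """Limit of equation (9) as dk grows without bound."""
    N = kS * cS * (kp + kZ) + kp * kZ * cZ
    D = kS * (kp + kZ) + kZ * k
    return kS * N / (kZ * D)


k = 1.0
for dk in (0.1, 1.0, 10.0, 100.0, 1e4, 1e6):
    print(dk, round(delta_response(k, dk), 4))
print("limit:", round(limiting_increment(k), 4))
```

The printed increments rise toward the limit and then stop changing
appreciably, however much larger the stimulus increment is made.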

It may be objected to this, as well as to most of our formulation, 
that it implies the ability of subjects in psychophysical discrimina- 
tion experiments to make quantitative estimates of their perception 
of stimulus intensities or stimulus differences, and that in fact they 
can only make such judgments as “greater,” “less,” or “equal.” We 
would point out to such critics that this is chiefly a consequence of 
the way in which such experiments are set up and in which the types 
of judgments to be given are determined by the observer. Naturally, 
the accuracy of such judgments by subjects will vary considerably 
with their acquaintance with the kind of magnitude concerned, and 
their consequent possession of what one might call an internal scale 
of measurement; but this does not at all affect the principle of our 
quantitative model. To anyone who doubts that human beings can 
ever estimate magnitudes quantitatively, we recommend observation 
of a surveyor or carpenter estimating lengths, or a butcher measur- 
ing out a pound of hamburger. 

Let us consider now the situation in which the stimulus incre- 
ment may be characterized as a just noticeable difference. It is implic- 
it in the theoretical description of such a situation that the substance 
B which determines the response does not produce a discriminating 
response for every increment by which it may increase, but only if 
the increment equals or exceeds a certain threshold value. We want 
now to find the stimulus increment which corresponds to this thres- 
hold; this would be the just noticeable difference of stimulus. 

We call the threshold T. Let us substitute T for $\delta c_B$ in equation
(9), and solve for $\delta k$. This gives

$$\delta k = \frac{T[k_S (k' + k_Z) + k_Z k]^2}{k_S [k_S c_S (k' + k_Z) + k' k_Z c_Z] - T k_Z [k_S (k' + k_Z) + k_Z k]} . \tag{11}$$

The difference limen $\delta k$ increases with k as should be the case
(4, p. 126, p. 163).

The most obviously striking thing about this relation is the way
in which it exhibits the effect of imposing a threshold on the
substance B. For as T increases, $\delta k$ does not merely increase more or
less rapidly with it, as one might at first expect. There is in addition
a finite value of T for which $\delta k$ becomes infinite. That is to say, if
the discrimination threshold of an organism reaches a certain point,
the organism can no longer discriminate any finite change of stimulus,
no matter how large. This limiting value of the threshold is given
by that value of T for which the denominator of equation (11)
vanishes, namely,

$$T = \frac{k_S [k_S c_S (k' + k_Z) + k' k_Z c_Z]}{k_Z [k_S (k' + k_Z) + k_Z k]} . \tag{12}$$



This value is a function of the previously existing stimulus level k. 
As may be seen from equation (12), it varies inversely with k; the 
higher the previous stimulus intensity, the lower is the threshold val- 
ue required to block completely the discrimination of a new stimulus 
intensity. If the initial stimulus is sufficiently large, the discrimina- 
tion is impossible for an organism with any finite threshold, no mat- 
ter how small. 
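
The behaviour of the difference limen near the critical threshold can be
sketched numerically from equations (9), (11), and (12). The parameter
values are assumed for illustration only, and the function names are
hypothetical.

```python
# Assumed, purely illustrative parameter values; kp stands for k'.
kS, kZ, kp, cS, cZ = 1.0, 0.5, 0.2, 2.0, 0.1
N = kS * cS * (kp + kZ) + kp * kZ * cZ  # bracketed numerator term of (9)


def D(k):
    """Common denominator term k_S(k' + k_Z) + k_Z k."""
    return kS * (kp + kZ) + kZ * k


def delta_response(k, dk):
    """Response increment, equation (9)."""
    return kS * dk * N / (D(k) * (D(k) + kZ * dk))


def difference_limen(T, k):
    """Stimulus increment that just reaches the threshold T, equation (11)."""
    return T * D(k) ** 2 / (kS * N - T * kZ * D(k))


def critical_threshold(k):
    """Threshold at which the difference limen becomes infinite, equation (12)."""
    return kS * N / (kZ * D(k))


k = 1.0
T_crit = critical_threshold(k)
for frac in (0.25, 0.5, 0.9, 0.99, 0.999):
    print(frac, round(difference_limen(frac * T_crit, k), 2))
print("critical thresholds at k = 1 and k = 4:",
      round(critical_threshold(1.0), 4), round(critical_threshold(4.0), 4))
```

Substituting the computed limen back into `delta_response` returns the
threshold, which is a convenient consistency check on (11).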

The value of $\delta k$ for k = 0 is the stimulus limen or RL (4, p. 111).

It is instructive to write equation (11) in a form which explicitly
gives $\delta k / k$, the form perhaps most familiar in the psychophysical
literature as the Weber ratio. This function in our case is

$$\frac{\delta k}{k} = \frac{T[k_S (k' + k_Z) + k_Z k]^2}{k\,\{k_S [k_S c_S (k' + k_Z) + k' k_Z c_Z] - T k_Z [k_S (k' + k_Z) + k_Z k]\}} . \tag{13}$$

For small values of k, the function is very large, and it decreases as
k increases. However, the denominator vanishes and the function
becomes infinite at a finite value of k given by

$$k = \frac{k_S [k_S c_S (k' + k_Z) + k' k_Z c_Z]}{k_Z^2\, T} - \frac{k_S (k' + k_Z)}{k_Z} . \tag{14}$$

The function must therefore necessarily have a minimum which
precedes this sharp upturn (4, p. 136). The theory predicts, in short,
that for a finite (though possibly quite large) value of the initial
stimulus, no finite stimulus increment, however large, can be
discriminated. From equation (14) it is seen that this critical value of k
varies inversely as the threshold T, thus exhibiting the converse of
the relation shown in equation (12).

The existence of such threshold effects as the above seems reasonable
enough in view of the known facts. It is by no means easy, however,
to see what feature of the mechanism is responsible for this
result. A close study of the deductive steps involved in arriving at such
predictions suggests that they are connected with the fact that $c_B$, as
given by equation (3), approaches a finite limiting value with
increasing k, but never is capable of an infinite range of values. It is
illuminating to compare the present model with such a situation as is
presented by molecules crossing a potential energy barrier, where only
those molecules with sufficiently high kinetic energies are able to cross 
a barrier of a given magnitude. Since the usual kinetic energy distri- 
bution functions, such as the Maxwellian distribution, admit kinetic 
energy values approaching infinity, there are always some molecules 
present which can cross any potential barrier no matter how high; 
the number crossing drops to zero only as the barrier becomes in- 
finitely high. But when the quantity involved in a threshold never 
exceeds finite magnitudes, one would expect that the threshold would 
block completely even when it is only of finite height. 
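
The shape of the Weber ratio (13) and the critical stimulus level (14)
can be explored with a short sketch. As before, the parameter values and
the threshold `T` are assumptions for illustration, and the minimum is
located by a coarse grid search rather than analytically.

```python
# Assumed, purely illustrative parameter values; kp stands for k'.
kS, kZ, kp, cS, cZ = 1.0, 0.5, 0.2, 2.0, 0.1
N = kS * cS * (kp + kZ) + kp * kZ * cZ  # bracketed numerator term of (9)


def weber_ratio(k, T):
    """Weber ratio (difference limen divided by k), equation (13)."""
    D = kS * (kp + kZ) + kZ * k
    return T * D ** 2 / (k * (kS * N - T * kZ * D))


def critical_k(T):
    """Stimulus level at which the Weber ratio becomes infinite, equation (14)."""
    return kS * N / (kZ ** 2 * T) - kS * (kp + kZ) / kZ


T = 0.5
k_crit = critical_k(T)

# Grid strictly between 0 and the critical stimulus level.
ks = [k_crit * i / 200 for i in range(1, 200)]
ratios = [weber_ratio(k, T) for k in ks]
i_min = min(range(len(ratios)), key=lambda i: ratios[i])

print("critical k:", round(k_crit, 3))
print("minimum Weber ratio near k =", round(ks[i_min], 3))
```

On this grid the ratio is large at both ends and smallest at an interior
point, in line with the minimum that the text says must precede the
upturn.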

One further point may be worth mentioning. If k or T were in- 
creased beyond the critical values given by equations (12) or (14), 
we would enter a region in which $\delta k$ is negative. It is not clear to the
authors whether to attribute any particular significance to this cir- 
cumstance, or to regard this region as being psychologically meaning- 
less. Literally interpreted, it would mean that in this region a de- 
crease of stimulus intensity can be discriminated, but not an increase. 

In this connection it should be noted that we have previously 
failed to specify the sign of T , and have in fact implicitly treated it 
as a positive quantity. However, it is just as reasonable to suppose 
that a decrease of cg should correspond to a perception of decreasing 
the stimulus intensity; in that case we would have to regard T as being
positive whenever the organism supposes itself to see an increase
in the stimulus, and negative whenever it supposes itself to perceive 
a decrease in the stimulus. 

In that case, the possibility exists that T and $\delta k$ may be of
opposite sign. This is perhaps more obvious if we solve equation (11) for
T, giving

$$T = k_S\,\delta k\,\frac{k_S c_S (k' + k_Z) + k' k_Z c_Z}{\{k_S (k' + k_Z) + k_Z k\}\,\{k_S (k' + k_Z) + k_Z k + k_Z\,\delta k\}} . \tag{15}$$

Here, if $\delta k$ is negative, T also is negative until $\delta k$ reaches
such a value that

$$|\delta k| > \frac{k_S (k' + k_Z) + k_Z k}{k_Z} . \tag{16}$$

Then T is once more positive. It is tempting to attempt to give an
interpretation to such reversals of sign; but any interpretation one
can think of seems rather implausible. It is true that errors occur in
discrimination experiments, such that an increase of intensity is
sometimes perceived as a decrease (4, p. 131). But such errors do not
appear to be sharply divided into zones depending upon the magnitude
of the increment of stimulus intensity. On the contrary, they appear 
to occur with some kind of statistical distribution at almost all
intensity levels [see for example the data on the Müller-Lyer illusion in
(4)]. It seems likely that such errors of discrimination are concerned 
with mechanisms of a statistical character, which may be superim- 
posed upon such a fundamental mechanism as our present theoretical 
model represents, but entirely distinct from this mechanism. It there- 
fore seems advisable to regard these regions of reversal of sign as 
meaningless by-products of the formal relations (a situation by no 
means uncommon). 

A mechanism closely related to the one discussed here was earlier 
developed by Hecht (3, 5). This dealt with the specific field of visual
discrimination, and so may be regarded as a special case of our gen- 
eral considerations regarding the steady state. Hecht was of course 
able to interpret the substances of his theory concretely in terms of the 
photosensitive substances of the retina and their precursors and 
breakdown products. His expressions were in another sense some- 
what more general than ours, since he assumed one of the chemical 
reactions to be bimolecular. He did not, however, include a source 
and sink in his system. Thus his steady state is an equilibrium state, 
rather than a general steady state in which a non-vanishing flow of 
matter and energy is involved. 

It is obvious how one could examine the effect of the other parameters
of the system on $c_B$, if one were inclined to suspect any one of
them as a more likely representative than k for the stimulus intensity.
This development may be left aside for the present. We may note in
passing, however, that a change of either $c_S$ or $c_Z$ can be ruled out
immediately, since either one would produce a response increment 
simply proportional to the stimulus increment, and independent of 
the initial stimulus level. 

It should be emphasized that we are not proposing the foregoing 
simple model as a “theory” of psychophysical discrimination pro- 
cesses, but simply as an illustration of what can be done with such 
an approach. It would be premature to attempt an interpretation of 
the model by arguing that our substances A and B might be identified 
with acetylcholine, potassium ion, or the like. Moreover, if one of 
these substances or any other should prove to be fundamental in the 
functioning of the central nervous system, it is almost certain that 
such a function would have to be represented by a more complicated 
reaction system than the one studied here.


Fifty-two subjects differing in sex, age, education and domicile
(rural or urban) were given the problem of judging the height of 
an upright board in a natural setting. A preliminary analysis was 
made on the basis of the simple initial ratio method, both for the 
original data in feet and for original data converted to log units.
Because the effects of interaction of the several variables made the
results of this method inconclusive, the analysis of variance technique,
as described by Yates (11) for data where the classes are
not equally represented, was applied. This technique showed that, 
while together the four factors markedly affected judgment, sex 
had no significant individual effect, age had the biggest individual 
effect but possibly a spurious one, education and domicile had sus- 
piciously large individual effects, and the effect of the four factors 
may be regarded as simply additive. The relation of the findings to 
those of previous investigators is discussed. The authors regard as 
an important result of the analysis the guidance it offers in the 
design of further experiments, since it demonstrates the value of 
equal representation for all classes into which data are to be
segregated.


I. Problem 

In the most exhaustive study on size constancy, the one by Hola- 
day (4), it was found that with the normal (objective) attitude and 
binocular vision, an object is usually slightly underrated as to its 
‘real’ size. Specifically, constancy judgments of 8-cm. cubes at
distances up to 8 m. result in a constancy ratio of 82.3. This means that
on a scale of 100, with projective size at the zero end and actual size
at the high end, the cubes were seen as 17.7 units shorter than the
actual size. Not expressed in terms of the constancy ratio, but rather in
percentages of actual size, the underestimation would in any event 
be less than 17.7. How much less would depend on the value of the 
projective size used as the basis of the constancy-ratio calculation. 
But this information is not given in the Holaday study. His subjects 
were 10 graduate students in psychology. The study, like all others 
on size constancy, was conducted in a laboratory setting. 


* Responsible for the experiment and general interpretation. 
+ Responsible for the statistical analysis.

It was the original purpose of the present study to find out how
the results of size estimation in a natural setting would compare with
Holaday’s findings.


II. Method 

1. Procedure. The important difference in procedure of the pres- 
ent study and the laboratory constancy experiments is that in the 
latter regular psychophysical procedure (method of limits or of con- 
stant stimuli) with a series of comparison objects is used, whereas 
in the present study we had to resort to purely verbal judgment. 

The task was the judgment of the height of an orange-colored 
upright board, which was 7 feet high and 8 inches wide. Mounted on 
a tripod, the lower end of the board rose 2 feet above the ground, so 
that the entire board was visible, unobstructed by grass. It was erect- 
ed at a distance of 1000 feet from the observer, in a landscape of grass 
and shrubbery with medium tall trees in the background, about 1500 
feet from the point of observation. The erection of a series of com- 
parison objects which would have had to reach a considerable height 
was not possible. Thus our results are not strictly comparable to 
those obtained by the usual constancy procedure. Yet we would doubt 
that the differences found are altogether due to the difference in pro- 
cedure. Extremely high judgments were verified as follows: The 
observer’s attention was called to the barn in front of which he stood, 
and he was asked to compare the stimulus object to the corner height 
of the barn which was 20 feet. The one observer who estimated the 
stimulus as being 30 feet said, “It is about half as high again as the 
barn.” Others, according to their judgments, said it was about as 
high as the barn or lower. This way of verification resembles con- 
stancy procedure. 

The stimulus was hidden from the observer until he had reached 
the point of observation behind the barn. Then he was asked: “Do 
you see that orange thing over there? How tall does it look to you?” 
The response was invariably in even feet. 

2. The Sample. The study was conducted at the country place 
of one of the authors in a remote part of Columbia County, New York. 
For several weeks, any person who happened to pass by and had a 
few minutes to spare was asked to step to the observation point and 
render his judgment. In this way a sample of 52 was gathered which 
differed objectively in 4 factors: Regarding sex, 26 of the observers 
were women and 26 men. In age they ranged from 14 to well over 
60. Education varied from Ph.D. and M.D. degrees down to virtual 
illiteracy. As to domicile, 31 were urban and 21 rural. The urban 
group, without exception, lived in the New York metropolitan area
and had come to the country for a short visit. Domicile was largely
related to occupation and socio-economic status. Of the 9 urban men, 
4 were professional and 4 upper-middle-class business people. The 
22 urban women were all upper-middle-class or middle-class; 8 were 
or had been professional women and 6 business women. Of the 17 
rural men, 6 were tradespeople (carpenter, roofer, baker, electrician),
9 were farmers and road construction workers, one was a youth of 
14, while only one had a white-collar occupation. The 4 rural wom- 
en were young farm girls or farmers’ wives. 

Only inspection of the results showed that any one of these 4 
factors may have had something to do with the judgments. Thus no 
personal data were recorded at the time of the experiment, except the 
name. However, all observers were personally known to the experi- 
menter so that the needed information could be filled in subsequently. 
Sex and domicile were obvious dichotomies. Accordingly, educational 
status also was considered to divide the sample into two groups, de- 
pending on whether they had been to college or not. The former will 
hence be referred to as “more educated,” the latter as “less educated.” 
For age, the dividing line was drawn from inspection of the data and 
at an estimated age of 48. Those below 48 will hence be referred to 
as “young,” the others as “old.” On this basis the sample would con- 
sist of 16 classes, but was actually represented in only 10 classes and 
with unequal frequencies as shown in Table 3. 


III. Results 
Judgments at the upper end were spaced at wider intervals than 
at the lower end, which is in accordance with the Weber-Fechner law. 
Therefore the general practice with this type of data was followed 
and each score converted into an arbitrary log unit. This was done 
from the readings on semi-logarithmic graph paper. When calculations
were made on this basis, the values were reconverted into feet
where this was called for.


1. The sample as a whole. Distribution of the judgments by all 
52 observers is shown in Table 1, in feet and log units. The means 
as obtained from the two scales are given, and that in log units is re- 
converted into feet. 

                         TABLE 1
      Distribution of Judgments, in Feet and Log Units

            Judgments
        feet      log units      Frequencies
          3           1          [illegible]
          4           5               2
          5           8               5
          6          11               9
          7          13               2
          8          15               5
          9          17               2
         10          18          [illegible]
         11          20               1
         12          21               7
         15          24               4
         18          27               2
         20          28               3
         30          34               1
                                   n = 52

      M (feet) = [illegible]
      M (log units) = 16.3
      M (log units, reconverted) = 8.6 feet

The mean judgment of 8.6 feet (reconverted) for the 7 ft. board
represents an overestimation of 23%, which compares with an
underestimation of at least 17% in Holaday’s study. However, his sample
was limited to students. If we calculate from our sample the mean
for those 17 individuals who would be most comparable to the
students, i.e., the 10 young urban more educated women, the 6 young
urban more educated men and the 1 young rural more educated man (see
Table 3), we obtain a mean of 5.9 feet (reconverted). This
represents an underestimation of almost 16%, a figure coming close to
Holaday’s. Yet in the light of the discussion below, it would seem
that this similarity is purely a coincidence.
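The conversion into log units and back can be illustrated with a short sketch. The authors read their units off semi-logarithmic graph paper and give no formula; the mapping used below, units = 1 + 33 log10(feet / 3), is an assumption, chosen only because it reproduces the feet and log-unit pairs of Table 1 after rounding. The function names are hypothetical.

```python
import math

# Feet / log-unit pairs as printed in Table 1.
TABLE_1_PAIRS = {3: 1, 4: 5, 5: 8, 6: 11, 7: 13, 8: 15, 9: 17,
                 10: 18, 11: 20, 12: 21, 15: 24, 18: 27, 20: 28, 30: 34}

def feet_to_log_units(feet):
    """Assumed mapping: 33 units per decade, with 3 ft placed at unit 1."""
    return 1 + 33 * math.log10(feet / 3)

def log_units_to_feet(units):
    """Inverse of the assumed mapping (the 'reconversion' into feet)."""
    return 3 * 10 ** ((units - 1) / 33)

# Round the assumed mapping for every judgment that occurs in Table 1.
reproduced = {feet: round(feet_to_log_units(feet)) for feet in TABLE_1_PAIRS}

# Reconvert the mean of 16.3 log units into feet.
mean_reconverted = log_units_to_feet(16.3)
```

Under this assumed mapping the mean of 16.3 log units reconverts to a value between 8 and 9 feet, in line with the 8.6 feet the authors obtained from the graph paper. Averaging in log units and reconverting amounts to taking a geometric mean, which gives extreme high judgments less weight than an arithmetic mean in feet would.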

2. Preliminary analyses.* Four factors were recognized: sex, 
age, education and domicile, each of which divides the observers into 
two groups. The mean judgments, both in feet and log units, for each 
of the two groups obtained by the use of each factor are shown in 
Table 2. Lower judgments are given by the women, the young, the 
more educated and the urban. If, however, the differences obtained 
from these classifications are tested by comparison with their stand- 
ard errors, it is found that they are not all equally significant. The 
quantity t given in Table 2 is the ratio of the difference to its
standard error. It may be found either directly in this way or by a
simple analysis of variance (see 7). Each t has 50 degrees of freedom,
since out of the 51 degrees of freedom between 52 observers, 1 is used
up in making the comparison between men and women, or between old
and young, etc.


*The statistical notation used throughout follows that of Fisher (2) and
Mather (7). In particular, S is used to indicate summation, a process which is
often shown by Σ.

Whether the calculation is made using the initial judgments in
feet, or the converted judgments in log units, t has a probability of
somewhere near 0.1 for the men-women difference, which is thus not
significant on either test. For each of the other three factors, on the
other hand, t has a probability of 0.001 or lower, using either feet or
log units, and so each of the three factors might be considered to be
exercising a real effect on the judgment of height.
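The ratio t of a difference to its standard error can be sketched as a pooled two-group comparison. The observations below are hypothetical placeholders, not the study's data; with the actual 52 judgments the degrees of freedom would be 52 - 2 = 50, as stated in the text.

```python
import math

def pooled_t(group_a, group_b):
    """Return (t, degrees of freedom) for the difference between two means,
    using a pooled within-group variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    ss_within = (sum((v - mean_a) ** 2 for v in group_a)
                 + sum((v - mean_b) ** 2 for v in group_b))
    df = n_a + n_b - 2
    pooled_variance = ss_within / df
    std_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df

# Hypothetical judgments in feet for two small groups.
group_one = [6, 7, 8, 9, 10]
group_two = [4, 5, 6, 7, 8]
t_value, df = pooled_t(group_one, group_two)
```

The value of t is then entered in a table of t at the stated degrees of freedom to read off its probability.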

This simple test cannot, however, be regarded as trustworthy, 
for the following reason. When all four classificatory factors are 
used, we can simultaneously distinguish 16 classes, young more edu- 
cated urban women, young more educated urban men, young more 
educated rural women and so on. It will be seen from Table 3 that 
only 10 of these 16 classes are represented by actual observers
in the data, and these 10 classes are not equally filled, the numbers of 
observers falling into them varying from 1 to 12. Thus, in taking, 
say, the comparison of rural as opposed to urban, the result will not 
be independent of the other three factors. There are, for example, 
20 less educated rurals and 16 more educated urbans, against 15 less 
educated urbans and 1 more educated rural. 
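How strongly education and domicile are entangled can be expressed as a fourfold (phi) coefficient computed from the counts just quoted. The paper does not report this quantity; it is added here only as an illustration, and the variable names are hypothetical.

```python
import math

# Counts quoted in the text.
rural_less, rural_more = 20, 1
urban_less, urban_more = 15, 16

def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2 x 2 table [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

phi = phi_coefficient(rural_less, rural_more, urban_less, urban_more)
total = rural_less + rural_more + urban_less + urban_more
```

A phi of about 0.49 indicates that rural domicile and less education go together far more often than independence would allow, so a simple rural-urban difference necessarily carries part of the education difference with it.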

Furthermore, it can easily be shown by the calculation of a rank 
correlation that the factors supplement one another. Each of the 10 
classes was given a score for each factor. The score of 1 was given 
when the class contained the observers giving the lower judgment on 
the classification in question, and correspondingly a score of 2 was 
given to the higher judging group. Thus women, young, more edu- 
cated and urban all take scores of 1, while men, old, less educated 
and rural all take scores of 2. A female, old, more educated and ur- 
ban would, for example, have a score of 1.2.1.1. The scores for each 
of the 10 classes are given in Table 4. The four digits in each score 
are then added to give a total index which can be correlated with the 
mean judgment (calculated from log units and reconverted to feet) 
of the class. The rank order correlation obtained in this way is .80 
and shows well how the effects of the four factors supplement one 
another. Thus our simple tests of the significance of effect of each of 
the four factors must be misleading. To take the example quoted 
above, there are a greater number of rural less educated and urban 
more educated than there are of rural more educated and urban less 
educated. The simple difference between urban and rural must thus 
include a supplementary reinforcing contribution from the more edu- 
cated versus less educated classification. A more rigorous statistical 
analysis is necessary. 
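The index and rank-correlation step can be sketched as follows. Table 4 is not reproduced in this section, so the class scores and class means below are hypothetical placeholders; only the procedure (sum the four digits of each score, then rank-correlate the sums with the class means) follows the text. Tied index values are given average ranks, which is an assumption, since the paper does not say how ties were handled.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def rank_correlation(x, y):
    """Rank correlation computed as the correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def total_index(score):
    """Sum the four digits of a score such as '1.2.1.1'."""
    return sum(int(digit) for digit in score.split("."))

# Hypothetical class scores (sex.age.education.domicile) and class means in feet.
class_scores = ["1.1.1.1", "2.1.1.1", "1.2.1.1", "2.1.2.2", "2.2.2.2"]
class_means = [5.5, 6.0, 9.0, 8.5, 14.0]
indices = [total_index(score) for score in class_scores]
rho = rank_correlation(indices, class_means)
```

A high positive value of the coefficient means that classes scoring 2 on more factors tend to give the higher mean judgments, which is the sense in which the factors supplement one another.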

3. Full Analysis. The method of analyzing data of the present 
type, where the classes are not all equally represented, some even 
being entirely missing from the multiple classification, is discussed 
by Yates (11). He describes a method of fitting constants, to repre- 
sent the effects of each classificatory factor, by means of the least 
squares technique. 

As applied to our data, we propose in effect, on this method, to
find a best fitting constant $b_s$ which may be taken to represent the
effect of sex on judgment, a second, $b_a$, to represent the effect of
age, and $b_e$ and $b_d$ to represent education and domicile effects.
These constants are calculated so that the summed squares of the
differences between observed judgments and those expected on the basis
of the four constants are at a minimum. The analysis is formally
equivalent to that of a multiple regression analysis, fully described
by Fisher (2) and Mather (7). Those accounts will give all the details
of procedure.

Let us first analyze the data in the form of the initial judgment 
in feet. The first step is, as in calculating the rank correlation, to 
assign scores representing the classifications. These may be arbitrary 
since each factor divides the data into only two classes, and hence -1
and 1 have been used in each case for the calculations. (The corre- 
sponding scores of 1 and 2 as used in Table 4, or indeed any other 
scores, however, could have been taken without affecting our ultimate 
tests of significance.) 


Where

     $x_s$ is the sex score, $x_s = -1$ for women and $x_s = 1$ for men;
similarly

     $x_a$ is the age score, $x_a = -1$ for young and $x_a = 1$ for old,

     $x_e$ is the education score, $x_e = -1$ for more educated and
     $x_e = 1$ for less educated,

     $x_d$ is the domicile score, $x_d = -1$ for urban and $x_d = 1$
     for rural.


When y is the observed judgment in feet and Y the corresponding
expectation derived from the constants, we can represent our problem
as that of calculating the regression coefficients $b_s \cdots b_d$ in
the equation

\[
Y = \bar{y} + b_s(x_s - \bar{x}_s) + b_a(x_a - \bar{x}_a)
    + b_e(x_e - \bar{x}_e) + b_d(x_d - \bar{x}_d)
\]

such that $S(y - Y)^2$ is a minimum.


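The appropriate values of $b_s \cdots b_d$ are given by the solutions of
the four equations:

\[
\begin{aligned}
b_s S(x_s - \bar{x}_s)^2 + b_a S[(x_a - \bar{x}_a)(x_s - \bar{x}_s)]
  + b_e S[(x_e - \bar{x}_e)(x_s - \bar{x}_s)]
  + b_d S[(x_d - \bar{x}_d)(x_s - \bar{x}_s)]
  &= S[(x_s - \bar{x}_s)(y - \bar{y})] \\
b_s S[(x_s - \bar{x}_s)(x_a - \bar{x}_a)] + b_a S(x_a - \bar{x}_a)^2
  + b_e S[(x_e - \bar{x}_e)(x_a - \bar{x}_a)]
  + b_d S[(x_d - \bar{x}_d)(x_a - \bar{x}_a)]
  &= S[(x_a - \bar{x}_a)(y - \bar{y})] \\
b_s S[(x_s - \bar{x}_s)(x_e - \bar{x}_e)]
  + b_a S[(x_a - \bar{x}_a)(x_e - \bar{x}_e)]
  + b_e S(x_e - \bar{x}_e)^2
  + b_d S[(x_d - \bar{x}_d)(x_e - \bar{x}_e)]
  &= S[(x_e - \bar{x}_e)(y - \bar{y})] \\
b_s S[(x_s - \bar{x}_s)(x_d - \bar{x}_d)]
  + b_a S[(x_a - \bar{x}_a)(x_d - \bar{x}_d)]
  + b_e S[(x_e - \bar{x}_e)(x_d - \bar{x}_d)]
  + b_d S(x_d - \bar{x}_d)^2
  &= S[(x_d - \bar{x}_d)(y - \bar{y})].
\end{aligned}
\]
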
The items of the types $S(x_s - \bar{x}_s)^2$ and
$S[(x_s - \bar{x}_s)(x_a - \bar{x}_a)]$ are found from the scores
described above. Now

\[
S(x_s - \bar{x}_s)^2 = S(x_s^2) - \frac{1}{n} S^2(x_s),
\]

where n is the number of observers. There are 26 observers
(women) with a score of $x_s = -1$ and 26 (men) with $x_s = 1$. Thus

\[
S(x_s^2) = 26 \times (-1)^2 + 26 \times 1^2 = 52
\quad\text{and}\quad
\frac{1}{n} S^2(x_s) = \frac{1}{52}\,[26 \times (-1) + 26 \times 1]^2 = 0.
\]

So $S(x_s - \bar{x}_s)^2 = 52 - 0 = 52$.

Similarly

\[
S[(x_s - \bar{x}_s)(x_a - \bar{x}_a)] = S(x_s x_a) - \frac{S(x_s)\,S(x_a)}{n}.
\]

We have 17 young females with $x_s = -1$ and $x_a = -1$, 22 young
males with $x_s = 1$ and $x_a = -1$, 9 old females with $x_s = -1$ and
$x_a = 1$, and 4 old males with $x_s = 1$ and $x_a = 1$.
Then

\[
S(x_s x_a) = [17 \times (-1) \times (-1)] + [22 \times 1 \times (-1)]
           + [9 \times (-1) \times 1] + [4 \times 1 \times 1] = -10.
\]

$S(x_s) = 0$ and $S(x_a) = -26$, giving $\frac{1}{n} S(x_s) S(x_a) = 0$.

Thus

\[
S[(x_s - \bar{x}_s)(x_a - \bar{x}_a)] = -10 - 0 = -10.
\]
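The arithmetic above can be checked with a few lines. The cell counts (17, 22, 9 and 4) and the -1 and 1 scores are taken from the text; the helper name is hypothetical.

```python
# (sex score, age score, number of observers) for the four sex-by-age cells.
cells = [(-1, -1, 17),   # young females
         (1, -1, 22),    # young males
         (-1, 1, 9),     # old females
         (1, 1, 4)]      # old males

n = sum(count for _, _, count in cells)

def corrected_sum_of_products(index_u, index_v):
    """S[(u - mean u)(v - mean v)] = S(uv) - S(u)S(v)/n over all observers."""
    s_uv = sum(cell[index_u] * cell[index_v] * cell[2] for cell in cells)
    s_u = sum(cell[index_u] * cell[2] for cell in cells)
    s_v = sum(cell[index_v] * cell[2] for cell in cells)
    return s_uv - s_u * s_v / n

ss_sex = corrected_sum_of_products(0, 0)       # sum of squares for sex
ss_age = corrected_sum_of_products(1, 1)       # sum of squares for age
sp_sex_age = corrected_sum_of_products(0, 1)   # sum of cross products
```

The three results, 52, 39 and -10, are the coefficients that appear in the first two of the numerical equations for solution.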


The remaining sums of squares and sums of cross products,
including those involving y, the judgment in feet, are found in the
same way. The four equations may then be written down. It is,
however, preferable to solve the following four sets each of four
equations, the left sides of which are the same as those given above,
but the right sides of which have 1,0,0,0; 0,1,0,0; 0,0,1,0; and
0,0,0,1 substituted for $S[(x_s - \bar{x}_s)(y - \bar{y})]$, etc. The
solutions of these latter equations are necessary for the calculation
of the standard errors and also aid other calculations.

We thus have the following equations for solution:

\[
\begin{array}{rcl}
 52.00\,b_s - 10.00\,b_a +  6.00\,b_e + 26.00\,b_d &=& 1,0,0,0 \\
-10.00\,b_s + 39.00\,b_a + 17.00\,b_e +  3.00\,b_d &=& 0,1,0,0 \\
  6.00\,b_s + 17.00\,b_a + 45.77\,b_e + 19.46\,b_d &=& 0,0,1,0 \\
 26.00\,b_s +  3.00\,b_a + 19.46\,b_e + 50.08\,b_d &=& 0,0,0,1
\end{array}
\]

and the solution gives us 16 values generally denoted as

\[
\begin{array}{cccc}
C_{11} & C_{21} & C_{31} & C_{41} \\
C_{12} & C_{22} & C_{32} & C_{42} \\
C_{13} & C_{23} & C_{33} & C_{43} \\
C_{14} & C_{24} & C_{34} & C_{44}
\end{array}
\]

where $C_{11}$ is the value obtained for $b_s$ from the first set of
equations, $C_{43}$ that for $b_e$ from the fourth set and so on. This
C matrix of 16 solutions turns out to be

\[
\begin{array}{rrrr}
 0.02853 &  0.00871 & -0.00055 & -0.01512 \\
 0.00871 &  0.03365 & -0.01301 & -0.00148 \\
-0.00055 & -0.01301 &  0.03151 & -0.01118 \\
-0.01512 & -0.00148 & -0.01118 &  0.03225
\end{array}
\]
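Solving the four sets of equations with right sides 1,0,0,0 through 0,0,0,1 is the same as inverting the matrix of coefficients. A minimal sketch by Gauss-Jordan elimination follows, using the coefficients printed above. The function name is hypothetical, and elimination is only a stand-in for whatever hand method the authors used.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    size = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    work = [list(map(float, row)) + [float(i == j) for j in range(size)]
            for i, row in enumerate(matrix)]
    for col in range(size):
        # Bring the largest remaining entry of this column to the pivot position.
        pivot_row = max(range(col, size), key=lambda r: abs(work[r][col]))
        work[col], work[pivot_row] = work[pivot_row], work[col]
        pivot = work[col][col]
        work[col] = [value / pivot for value in work[col]]
        # Clear the column in every other row.
        for row in range(size):
            if row != col:
                factor = work[row][col]
                work[row] = [a - factor * b
                             for a, b in zip(work[row], work[col])]
    return [row[size:] for row in work]

# Coefficients of the sex, age, education and domicile constants.
A = [[52.00, -10.00, 6.00, 26.00],
     [-10.00, 39.00, 17.00, 3.00],
     [6.00, 17.00, 45.77, 19.46],
     [26.00, 3.00, 19.46, 50.08]]

C = invert(A)
```

Because the coefficient matrix is symmetric, so is its inverse, which is why the printed C matrix reads the same by rows and by columns.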


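Now

\[
\begin{aligned}
b_s &= C_{11} S[(x_s - \bar{x}_s)(y - \bar{y})] + C_{12} S[(x_a - \bar{x}_a)(y - \bar{y})] \\
    &\quad + C_{13} S[(x_e - \bar{x}_e)(y - \bar{y})] + C_{14} S[(x_d - \bar{x}_d)(y - \bar{y})] \\
    &= (0.02853 \times 54) + (0.00871 \times 106) - (0.00055 \times 119.38) - (0.01512 \times 113.23) \\
    &= 0.6862 \\
b_a &= C_{21} S[(x_s - \bar{x}_s)(y - \bar{y})] + C_{22} S[(x_a - \bar{x}_a)(y - \bar{y})] \\
    &\quad + C_{23} S[(x_e - \bar{x}_e)(y - \bar{y})] + C_{24} S[(x_d - \bar{x}_d)(y - \bar{y})] \\
    &= 2.3165 \\
b_e &= 1.0870 \\
b_d &= 1.3436.
\end{aligned}
\]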
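The four constants follow from multiplying the C matrix into the sums of products with y. The sketch below uses the printed C matrix and the four sums that appear in the worked line for the sex constant (54, 106, 119.38 and 113.23); the variable names are hypothetical.

```python
# C matrix as printed (rows and columns ordered sex, age, education, domicile).
C = [[0.02853, 0.00871, -0.00055, -0.01512],
     [0.00871, 0.03365, -0.01301, -0.00148],
     [-0.00055, -0.01301, 0.03151, -0.01118],
     [-0.01512, -0.00148, -0.01118, 0.03225]]

# S[(x - mean x)(y - mean y)] for sex, age, education, domicile (judgments in feet).
sums_with_y = [54, 106, 119.38, 113.23]

# Each constant is one row of C multiplied into the sums of products.
b = [sum(c * s for c, s in zip(row, sums_with_y)) for row in C]

# Sum of squares ascribable to the four factors: S(constant * sum of products).
ss_factors = sum(coef * s for coef, s in zip(b, sums_with_y))
```

Rounded to four places this gives 0.6862, 2.3165, 1.0870 and 1.3436 for sex, age, education and domicile, and a sum of squares for the four factors of about 564.5. The domicile value of 1.3436 is the one consistent with the t of 1.738 reported for domicile.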


The sum of squares, $S(y - \bar{y})^2$, between the 10 classes is found
to be 635.6352 of which an item

\[
b_s S[(x_s - \bar{x}_s)(y - \bar{y})] + b_a S[(x_a - \bar{x}_a)(y - \bar{y})]
  + b_e S[(x_e - \bar{x}_e)(y - \bar{y})] + b_d S[(x_d - \bar{x}_d)(y - \bar{y})],
\]

or 564.5057, is ascribable to the action of the four factors: sex, age,
education and domicile. This leaves 71.1295 as residual variation due
to difference between y and Y, i.e., $S(y - Y)^2$. Since there are 9
degrees of freedom between the 10 classes, 4 of which are taken up in
calculating $b_s \cdots b_d$, this residual sum of squares corresponds
to 5 degrees of freedom.

There is, however, a further residual sum of squares, viz., that 
from variation in judgment between members of the same class. This 
can be found by calculating the sum of squares between all 52 judg- 
ments, which turns out to be 1435.6923, and subtracting from it the 
item, 635.6352, found as the sum of squares between classes. The 
remainder is the sum of squares within classes. We thus obtain the 
analysis of variance of judgment in feet as shown in Table 5. 

The allocation of the degrees of freedom is not difficult to see. 
There are 51 in all, between 52 observers, of which 9 are between 
the 10 class means. The sum of squares within classes thus
corresponds to 42 degrees of freedom.

The mean squares are found as the ratio of the corresponding 
sums of squares to their numbers of degrees of freedom. Taking the 
mean square within classes as the estimate of error variation, the 
other two mean squares, that ascribable to the four classificatory
factors and the residuum between classes after fitting constants
representing these factors, may be tested for significance by comparison
with it. The residuum between classes is clearly not significant as 
its mean square is somewhat less than that for error. The other mean 
square, for differences ascribable to the classificatory factors, is next 
divided by the error mean square to give a variance ratio. The vari- 
ance ratio should be entered in a table of variance ratios such as 
those provided by Fisher and Yates (3), or Mather (7). On enter- 
ing in such a table, the item ascribable to the main effects of the four 
factors is found to have a very significantly low probability. The 
four factors, taken together, are having an effect on judgment. But 
since the residuum, which tests for “interaction” of the four factors, 
i.e., for departure from additiveness of their effects, is insignificant, 
they may be regarded as additive in their effects. 

As this residual or “interaction” item is not significant, it may 
be pooled with the item for variation within classes to give a pooled 
estimate of error based on 47 degrees of freedom, the mean square 
being 18.5359. 
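The partition of the sum of squares and the pooling of error can be laid out as plain arithmetic. All figures are those quoted in the text; Table 5 itself is not reproduced here, so the layout and names are only a sketch.

```python
ss_total = 1435.6923            # between all 52 judgments, 51 d.f.
ss_between_classes = 635.6352   # between the 10 class means, 9 d.f.
ss_factors = 564.5057           # ascribable to the four fitted constants, 4 d.f.

ss_residual = ss_between_classes - ss_factors   # "interaction" item, 5 d.f.
ss_within = ss_total - ss_between_classes       # within classes, 42 d.f.

ms_factors = ss_factors / 4
ms_residual = ss_residual / 5
ms_within = ss_within / 42

# Variance ratio for the four factors taken together, tested against error.
variance_ratio = ms_factors / ms_within

# Pool the non-significant residual with the within-class item.
ms_pooled = (ss_residual + ss_within) / (5 + 42)
```

The residual mean square (about 14.2) is smaller than the within-class mean square (about 19.0), as the text states, and the pooled value reproduces 18.5359.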

Now the standard error of $b_s$ is found by multiplying $\sqrt{C_{11}}$
by the square root of the error mean square and so is
$\sqrt{18.5359} \times \sqrt{0.02853} = \pm 0.7272$. The standard errors
of $b_a$, $b_e$ and $b_d$ are similarly found, by using $\sqrt{C_{22}}$,
$\sqrt{C_{33}}$ and $\sqrt{C_{44}}$ in place of $\sqrt{C_{11}}$, to be
0.7898, 0.7642, and 0.7731, respectively. Thus $b_s$ is less than its
standard error, while

\[
t_a = \frac{b_a}{\text{standard error of } b_a} = 2.933, \qquad
t_e = 1.422, \qquad t_d = 1.738.
\]

Each t has 47 degrees of freedom, since the error mean square was found
from this number of independent comparisons, and entry in a table of t
gives the probability (P) as

     $t_s$,  P = 0.4 - 0.3
     $t_a$,  P = 0.01 - 0.001
     $t_e$,  P = 0.2 - 0.1
     $t_d$,  P = 0.1 - 0.05.



Thus the effect ascribable unambiguously to age is clearly sig- 
nificant, that for sex clearly insignificant, and those for education 
and domicile of doubtful significance. 
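The standard errors and t ratios can be recomputed from the diagonal of the C matrix and the pooled error mean square. The constants used are those that follow from the printed C matrix and sums of products (0.6862, 2.3165, 1.0870 and 1.3436); the names are hypothetical.

```python
import math

error_mean_square = 18.5359   # pooled estimate, 47 d.f.
diagonal_of_c = {"sex": 0.02853, "age": 0.03365,
                 "education": 0.03151, "domicile": 0.03225}
constants = {"sex": 0.6862, "age": 2.3165,
             "education": 1.0870, "domicile": 1.3436}

# Standard error of each constant: sqrt(C_ii * error mean square).
standard_errors = {name: math.sqrt(c_ii * error_mean_square)
                   for name, c_ii in diagonal_of_c.items()}

# t is each constant divided by its own standard error.
t_values = {name: constants[name] / standard_errors[name]
            for name in constants}
```

On these figures only the age ratio exceeds 2, which matches the ordering of the probabilities listed above: age clearly significant, domicile and education doubtful, sex well short of significance.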

It may seem remarkable that, while the four factors taken together
have such a markedly significant effect, as shown by the analysis
of Table 5, only one of them, age, has a significant effect taken by
itself. The explanation is that $t_s$, $t_a$, etc., test the effects
unambiguously ascribable to sex, age, etc. There remain, however,
significant effects not ascribable unambiguously to any one factor,
owing to the unequal representation of the 16 classes, but which may
be traced to the four factors acting together. In the present case
this item can be shown to represent nearly half of the sum of squares
attributed to the four factors in Table 5. In other words, the design
of the experiment is such that only half the information available can
be interpreted; the other half cannot be used to tell us anything about
the actions of the four factors as individuals. Only by having all 16
classes represented by equal numbers of observers can this loss of
information be wholly prevented. It may be added, too, that in such
a case the analysis would be much less laborious than that which it
was necessary to use for the present unbalanced data.

The conclusions to which the analysis of the data in feet
have led are changed in no material particular when the converted
data in log units are used. Since the scores $x_s \cdots x_d$ are
assigned independently of the actual judgments, the analysis as
conducted so far as the C matrix is the same for log units as for feet.
In log units

     $S[(x_s - \bar{x}_s)(y - \bar{y})] = 87.00$
     $S[(x_e - \bar{x}_e)(y - \bar{y})] = 187.81$
     $S[(x_d - \bar{x}_d)(y - \bar{y})] = 157.88$.


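We find

\[
\begin{array}{lll}
b_s = 1.2154 \pm 0.9765, & t_s = 1.245, & P = 0.3\text{-}0.2 \\
b_a = 2.8085 \pm 1.0600, & t_a = 2.650, & P = 0.02\text{-}0.01 \\
b_e = 2.2770 \pm 1.0262, & t_e = 2.219, & P = 0.05\text{-}0.02 \\
b_d = 1.4685 \pm 1.0382, & t_d = 1.414, & P = 0.20\text{-}0.10.
\end{array}
\]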


The analysis of variance becomes that given in Table 6, the calcu- 
lations being made exactly as before but with the log data. Again we 
find the interaction mean square to be insignificant (though now 
slightly greater than error). It is pooled with the mean square with- 
in classes to give a pooled estimate of error for 47 degrees of free- 
dom, from which the standard errors of the four b constants are cal- 
culated. As before, the sex effect is insignificant, though the age 
effect is significant. Education effects seem to be clearly significant 
now, though they were not clearly so in terms of the other units, 
while domicile is rather less significant than before. 

Again as before, the item testing the pooled effects of all four 
factors, in the analysis of variance, is much more significant than 
any of the four individual tests would suggest. This time, however, 
the information which is uninterpretable in terms of one or other of 
the four individual effects is somewhat more than half that contained
in the items of the analysis of variance. 

We may thus conclude that the four factors taken together 
markedly affect judgment of height, though owing to the nature of 
the data, only age can be shown to have a marked individual effect. 
Since the dividing line between “young” and “old” was taken to be 
48 as a result of inspection of the data themselves, the significance 
of even this factor must be viewed with some suspicion. Sex has no 
significant effect, and education and domicile have suggestive, though 
not unambiguously significant, individual effects. The effects of the 
four factors may be regarded as simply additive. 

The most important result of the analysis is, however, the guid- 
ance that is given us in designing further experiments. It is clear 
that the unequal representation of observers in the 16 classes has 
resulted in much of the information being uninterpretable in terms 
of the four individual effects, to say nothing of its leading to a more 
laborious statistical analysis. It is thus clear that in the future equal 
representation should be aimed at, even if it involves some extra 
trouble, for only in this way can the full value of the experimental 
procedure be realized. 


IV. Discussion 


As stated above, age as the strongest factor may be partly an 
artifact because here classification was based on inspection of the 
data alone. Furthermore, while the other two effective factors are 
environmentally determined, age is a physiological factor, and it is 
not possible here to go into the physiology of perception. Sex, on the 
other hand, was found to have no real effect. Therefore we shall limit 
the discussion to the suggestive, though not unambiguously signifi- 
cant, effects of education and domicile. 

1. Relationship to intelligence of the results from the present 
study and those from constancy experiments. The suspiciously effective 
factors of education and domicile—and incidentally age as well—have 
in common that they correlate with scores on intelligence tests, as 
has been shown in numerous studies. In addition, education and 
domicile in our sample coincided largely with occupation, a further 
factor known to be related to test intelligence. Thus, judgment of 
size would be related to intelligence, lower judgments being found 
with higher intelligence, i.e., educated, urban subjects. 

In several laboratory constancy experiments also, those with 
higher test intelligence, or the educated European-Americans, were 
found to perceive things as smaller than the less intelligent, or other 
ethnic groups. These experiments are: (a) Thouless (9) with 53
subjects found a correlation of -.41 ± 0.01 between intelligence, as
measured by the Cattell test and the NIIP, and size and shape
constancy; (b) Klimpfinger (5) found a size constancy ratio of 95.8 for
20 students, which compares to a ratio of 99.7 found previously by
another investigator for rural hired men and women; (c) Thouless
(10) in another study, found a size constancy ratio of 0.61 for 49
English students and one of 0.76 for 20 Indian students, with a
critical ratio of 4.3; (d) Beveridge (1) found a ratio of 0.75 for 8
Europeans and one of 0.88 for 44 West African natives, students of
drawing at a Presbyterian training college; (e) Sheehan (8) using 25
young college women found a correlation of -0.375 between size
constancy and scores on the Street Gestalt Completion Test which
“measures a non-verbal aspect of mental organization.”

While our findings agree with the constancy results regarding the 
inverse relationship between height of judgment and intelligence, they 
disagree regarding accuracy of judgment, which in our experiment 
was positively related to intelligence. The following is an attempt to 
explain this discrepancy. 

2. Effect of the experimental situation. In the constancy ex- 
periments, the individual who makes a high score (great accuracy) 
is said to have an objective attitude, that of naive observation, and 
the one who makes a low score (less accuracy) is said to have a sub- 
jective or analytical attitude, the latter being found with greater 
intelligence. 

Our findings and observations would confirm the existence of 
these two attitudes and their effect on accuracy of judgment, but not 
their dependency on intelligence, nor their effect on height of judg- 
ment. Both these relationships are determined by the experimental 
situation, as follows: 

In the constancy experiments it is the less intelligent who assume 
the naive attitude because a laboratory situation is unfamiliar, means 
nothing to the primitive, the child, the less intelligent; it is not part 
of his world. Thus he yields to the impression without further re- 
flection, always the better way to respond to tasks of this kind. To 
the more intelligent, however, it may be an intellectual challenge, re- 
sulting in the analytical attitude with consequent greater deviation 
from the actual size. 

Our situation, on the other hand, offered a greater challenge to 
the less intelligent. The strange thing in the landscape meant nothing 
to the more educated professional city people who thus yielded to the 
impression without much reflection, in the naive attitude. They
approached the task with a certain indifference. But to the less educated
rural people the task was a challenge which they took up with
interest and confidence in their experience. This resulted in the
analytical, subjective attitude. To judge the size of an object in the
landscape was part of their practical, everyday lives; the object might be
something in relation to their work. Thus their level of aspiration 
was high, whereas that of the city people was low. Aside from gen- 
eral observations, these considerations are supported by one extreme 
case. The lowest judgment among the rural people by far was made 
by a happy-go-lucky, unsettled young woman of the lowest rural socio- 
economic level. Being completely indifferent, she was not challenged 
by the task. 

The experimental situation also determines the direction of the 
error. In our experiment the 7-ft. board in its setting could not well 
be underestimated more than 3-4 feet, whereas there was no limit to 
overestimation. Thus those who tended to larger error found their 
outlet on the upper side. In the constancy experiments the percep- 
tual laws underlying the constancy phenomenon and the experimental 
procedure favor the error of underestimation. 

The discrepancy stated at the end of section IV, 1 finds its reso- 
lution then in the realization that intelligence is not the decisive fac- 
tor in the results of these two types of experiment, but rather attitude. 
Attitude will be determined by the experimental situation, which also 
determines the direction of the error. 

3. Theory. On the basis of numerous animal experiments and 
the human constancy experiments cited above, Locke (6) proposed 
as a general theory, to be valid within the human species as well, that 
perceptual ability decreases as a given phylum becomes more complex 
and its intelligence increases. This would be because intelligence dis- 
places perception as the principal means of coping with the environ- 
ment. 
Our considerations would disclaim the validity of this theory for 
the human species. The apparent “perceptual ability” of the less in- 
telligent in the human constancy experiments is probably only the re- 
sult of the laboratory situation if in an actual outdoor situation the 
more intelligent show so much more “perceptual ability.” Further- 
more, Locke’s theory involves the assumption that perceptual con- 
stancy is a unitary factor. But this assumption was not verified in a 
comprehensive study by Sheehan (8) designed to settle this problem. 

On the basis of the present results and considerations we would 
conclude that at the human level perceptual function, like so many 
others, is determined largely by the individual’s background and
interests, his social and individual personality, his level of
aspiration, but not by his intelligence.


V. Summary and Conclusions

Fifty-two observers, heterogeneous as to sex, age, education and 
domicile, judged the size of an orange-colored upright board that was 
8 inches wide and 7 feet high, erected at a distance of 1000 feet from 
the observer in a landscape of grass and shrubbery. The following 
results were found: 


1. The sample as a whole overestimated the size by 23% [mean 
judgment 8.6 feet (reconverted), range 3 to 30 feet]. 

2. The 17 more educated subjects underestimated the size by 
16%, which is similar to size-underestimation of at least 17% by stu- 
dents in a constancy experiment. Yet this similarity appears to be a 
mere coincidence.

3. Preliminary analysis of the data, whether in feet or trans- 
ferred into log units, showed that lower judgments were made by the 
women, the young, the more educated and the urban, as compared 
with the men, the old, the less educated and the rural people, the sex 
difference, however, being insignificant. 

4. More rigorous analysis showed that, while taken together the 
four factors markedly affected judgment, (a) sex had no real individ- 
ual effect, (b) age had the biggest individual effect but possibly a 
spurious one due to the treatment of the data, (c) education and 
domicile had suspiciously large individual effects, (d) the four factors 
were simply additive in effect. 

5. In the present experiment higher judgment meant greater 
error. Thus greater error was found with lower education and rural 
domicile, both factors known to be related to lower test intelligence. 
This is in disagreement with (a) the results of certain constancy ex- 
periments where greater accuracy was found with lower intelligence 
and (b) a general theory proposed by Locke (6) on the basis of these 
constancy results, according to which perceptual efficiency at the hu- 
man level as well as on the sub-human level is inversely related to 
intelligence. 

6. While thus intelligence cannot be regarded as the factor in- 
fluencing accuracy of judgment in both our experiment and the con- 
stancy experiments, attitude can be regarded as the common factor. 
The naive attitude would seem to make for greater accuracy of judg- 
ment in both situations—it being present among the less intelligent 
in the constancy experiments and among the more intelligent in our 
experiment as determined by the experimental situation. 


THE RELIABILITY OF COMPONENT SCORES


FREDERICK B. DAVIS 


COOPERATIVE TEST SERVICE OF THE AMERICAN COUNCIL ON EDUCATION* 


A method is given for determining the reliability of each of the
components resulting from a factor analysis by the principal axis
method.


Determination of the reliability coefficient of scores in each of 
the principal components derived from a factorial analysis is desir- 
able for several reasons. It is obvious that the naming and interpre- 
tation of components the variances of which cannot be proved greater 
than might be yielded by chance alone may be misleading and decep- 
tive. Consequently, Hotelling and Thurstone long ago considered this 
problem, and more recently Hoel† described a precise test for
determining the number of significant components that could be obtained
by factorial analysis from a given matrix. 

A somewhat different approach to determining the statistical
significance of components has been suggested by Kelley.‡ First, he has
provided a variance-ratio test of the uniqueness of principal-axis com- 
ponents as they exist at any stage of the Kelley iterative process for 
their determination.§ By means of this test it is possible to ascertain 
the likelihood that a given component will re-occur in subsequent sam- 
ples by discovering whether the magnitudes of the variances of suc- 
cessive components are significantly different. Second, he has sug- 
gested that whether the variance of a principal component is greater 
than might be yielded by chance can be determined by noting whether 
the reliability coefficient of scores in the component is significantly 
greater than zero. This follows, of course, from the fact that the non- 
chance variance of a variable is directly proportional to the reliability 
coefficient of the variable. 

* On leave for military service.

† P. G. Hoel. A significance test for component analysis. Annals of Math.
Stat., 8, 149-158.

‡ The writer is indebted to Dr. T. L. Kelley for pointing out the importance
of this problem and outlining the method of obtaining component reliabilities.

§ T. L. Kelley. A variance-ratio test of the uniqueness of principal-axis
components as they exist at any stage of the Kelley iterative process for their
determination. Psychometrika, 1944, 9, 199-200.

It might be supposed that the magnitudes of the reliability
coefficients of the principal components obtained from any given matrix
would be directly proportional to the magnitudes of the component
variances, but that is not mathematically necessary and has not 
proved to be the case in actual practice. The writer has obtained re- 
liability coefficients for the principal-axis components derived from 
two entirely different matrices. Computation of the coefficients for 
nine components derived from the first matrix was done by an empirical
procedure which is very laborious;* computation of the coefficients
for fourteen components derived from the second matrix was done by 
specific application of the formulas presented later in this article. 
In both cases, there was the expected tendency for the larger compo- 
nents to have the higher reliability coefficients but the agreement was 
far from perfect. In the set of nine components, the first, second, 
third, seventh, and eighth were found to have reliability coefficients 
sufficiently greater than zero to warrant the belief that their vari- 
ances were significant. In the set of fourteen components, the reli- 
ability coefficients of the initial variables were based on a far larger 
sample and only the ninth and fourteenth components had reliability 
coefficients sufficiently low as to warrant discarding the components 
in interpreting the results of the study. 

A test of the sort described by Hoel indicates only the number 
of significant components that may be obtained, while if the reliabil- 
ity coefficient of each component has been computed, the components 
the variances of which are not significantly greater than zero may be 
identified and discarded. Furthermore, if individual scores in the 
useful components are to be obtained, the standard error of measure- 
ment of obtained component scores may easily be estimated. Since 
the reliability coefficients of most of the components obtained from a 
given matrix are likely to be low in comparison with the reliability 
coefficients of the majority of achievement and aptitude tests, their 
significance is an important consideration, the determination of which 
justifies the expenditure of considerable labor. In fact, the writer will 
go so far as to say that if a factorial analysis is worth doing, it is 
worth the labor required to obtain the reliability coefficients of the 
components found. 

Let the first component of a matrix containing $n$ initial variables
be denoted as

$$C_1 = b_1 x_1 + b_2 x_2 + \cdots + b_n x_n , \tag{1}$$

where $b_1, b_2, \ldots, b_n$ are regression coefficients derived from a matrix
by a principal-axis method, and $x_1, x_2, \ldots, x_n$ are scores in the initial
variables expressed as deviations from their respective means.


* Frederick B. Davis. Fundamental factors of comprehension in reading. 
Psychometrika, 1944, 9, 185-197. 
If we let $x_a$ and $x_A$ represent equivalent halves of $x_1$, $x_b$ and $x_B$
represent equivalent halves of $x_2$, etc., then $C_a$ and $C_A$ must represent
equivalent halves of $C_1$ and may be denoted as

$$C_a = b_1 x_a + b_2 x_b + \cdots + b_n x_n , \tag{2}$$
$$C_A = b_1 x_A + b_2 x_B + \cdots + b_n x_N . \tag{3}$$

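Also,

$$\sigma_1 = \sqrt{\sigma_a^2 + \sigma_A^2 + 2\sigma_a\sigma_A r_{aA}} . \tag{4}$$

Since $x_a$ and $x_A$, $x_b$ and $x_B$, ..., $x_n$ and $x_N$ are equivalent halves,
equation (4) may be written as

$$\sigma_1 = \sqrt{2\sigma_a^2 + 2\sigma_a^2 r_{aA}} . \tag{5}$$

Solving for $\sigma_a$, we obtain

$$\sigma_a = \frac{\sigma_1}{\sqrt{2(1 + r_{aA})}} . \tag{6}$$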


The fact that $x_a$ and $x_A$, $x_b$ and $x_B$, ..., $x_n$ and $x_N$ are equivalent
halves also leads to the fact that all possible intercorrelations of the
halves of successive pairs of initial variables are equal to a very close
approximation; that is,

$$r_{ab} = r_{aB} = r_{Ab} = r_{AB} \tag{7}$$
$$r_{ac} = r_{aC} = r_{Ac} = r_{AC} \tag{8}$$
$$r_{mn} = r_{mN} = r_{Mn} = r_{MN} . \tag{9}$$


Since the intercorrelations of the initial variables are known, it
is possible to secure numerical values for the intercorrelations of the
halves.

$$r_{12} = r_{(x_a + x_A)(x_b + x_B)} = \frac{\sum (x_a + x_A)(x_b + x_B)}{\sqrt{\sum (x_a + x_A)^2}\,\sqrt{\sum (x_b + x_B)^2}} . \tag{10}$$

If equation (10) is expanded and simplified by making use of
the identities in (7), (8), and (9),

$$r_{12} = \frac{2 r_{ab}}{\sqrt{1 + r_{aA}}\,\sqrt{1 + r_{bB}}} . \tag{11}$$

Solving equation (11) for $r_{ab}$, we obtain

$$r_{ab} = \frac{r_{12}\sqrt{1 + r_{aA}}\,\sqrt{1 + r_{bB}}}{2} . \tag{12}$$

By means of equations analogous to (12) all the desired intercorrelations
of halves of the initial variables may be obtained.
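
As a purely illustrative check of equation (12), take hypothetical values that do not come from either study: a correlation of .60 between the first two initial variables, and split-half correlations of .80 and .70 for those variables. Under these assumed figures the correlation between a half of the first variable and a half of the second would work out as follows.

$$r_{ab} = \frac{r_{12}\sqrt{1 + r_{aA}}\,\sqrt{1 + r_{bB}}}{2} = \frac{.60\,\sqrt{1.80}\,\sqrt{1.70}}{2} \approx \frac{.60 \times 1.749}{2} \approx .525$$

The half-length variables correlate somewhat less than the full-length ones, as would be expected when each half is less reliable than the whole.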
Returning now to equations (2) and (3), the correlation of $C_a$ and $C_A$
is, by definition,

$$r_{C_a C_A} = \frac{\sum C_a C_A}{\sqrt{\sum C_a^2}\,\sqrt{\sum C_A^2}} . \tag{13}$$


Since $C_a$ and $C_A$ are equivalent, equation (13) may be written

$$r_{C_a C_A} = \frac{\sum C_a C_A}{\sum C_a^2} . \tag{14}$$

Substituting in equation (14), we obtain

$$r_{C_a C_A} = \frac{\sum (b_1 x_a + b_2 x_b + \cdots + b_n x_n)(b_1 x_A + b_2 x_B + \cdots + b_n x_N)}{\sum (b_1 x_a + b_2 x_b + \cdots + b_n x_n)^2} . \tag{15}$$

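Expanding and simplifying equation (15), we may write

$$r_{C_a C_A} = \frac{b_1^2\sigma_a^2 r_{aA} + b_2^2\sigma_b^2 r_{bB} + \cdots + b_n^2\sigma_n^2 r_{nN} + 2b_1b_2\sigma_a\sigma_b r_{ab} + 2b_1b_3\sigma_a\sigma_c r_{ac} + \cdots + 2b_mb_n\sigma_m\sigma_n r_{mn}}{b_1^2\sigma_a^2 + b_2^2\sigma_b^2 + \cdots + b_n^2\sigma_n^2 + 2b_1b_2\sigma_a\sigma_b r_{ab} + 2b_1b_3\sigma_a\sigma_c r_{ac} + \cdots + 2b_mb_n\sigma_m\sigma_n r_{mn}} . \tag{16}$$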


With the exception of the terms written $r_{aA}, r_{bB}, \ldots, r_{nN}$, the
method of obtaining numerical values for all the terms in equation
(16) has been mentioned. The terms $r_{aA}, r_{bB}, \ldots, r_{nN}$ are the correlations
of one half of each initial variable with the other half. They may
best be obtained directly, although they can be estimated by means
of the Spearman-Brown formula from the reliability of each entire
initial variable.

After a numerical value has been obtained for $r_{C_a C_A}$, the reliability
coefficient for component I may be estimated by means of the
Spearman-Brown formula.

Using the same correlation coefficients and the same standard 
deviations of the halves of the initial variables together with the ap- 
propriate different sets of b’s, one may estimate the reliability co- 
efficient of each other principal component in a similar manner. 
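
The computation described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the writer's own computing routine: the function name `component_reliability` and its argument names are hypothetical, the standard deviations supplied are those of the full initial variables, and the two halves of each variable are taken to have equal standard deviations, as the derivation assumes. The half standard deviations follow equation (6), the half intercorrelations follow equation (12), the correlation of the half-components follows equation (16), and the final step is the Spearman-Brown formula for doubled length.

```python
import math


def component_reliability(b, sd, r, r_split):
    """Estimate the reliability of a component C = sum(b[i] * x[i]).

    b        -- component weights b_1 .. b_n
    sd       -- standard deviations of the full initial variables
    r        -- n x n intercorrelations of the full initial variables
    r_split  -- correlation between the two halves of each variable
    """
    n = len(b)
    # Equation (6): standard deviation of one half of each variable.
    sd_half = [sd[i] / math.sqrt(2.0 * (1.0 + r_split[i])) for i in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        var_term = b[i] ** 2 * sd_half[i] ** 2
        numerator += var_term * r_split[i]
        denominator += var_term
        for j in range(i + 1, n):
            # Equation (12): correlation between halves of two variables.
            r_half = (r[i][j] * math.sqrt(1.0 + r_split[i])
                      * math.sqrt(1.0 + r_split[j]) / 2.0)
            cross = 2.0 * b[i] * b[j] * sd_half[i] * sd_half[j] * r_half
            numerator += cross
            denominator += cross

    r_halves = numerator / denominator        # equation (16)
    return 2.0 * r_halves / (1.0 + r_halves)  # Spearman-Brown step-up
```

With a single variable whose halves correlate .60, the sketch reduces to the ordinary Spearman-Brown result of .75, which is a convenient sanity check before applying it to a full set of weights.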
AN I.B.M. TECHNIQUE FOR THE COMPUTATION
OF ΣX² AND ΣXY*


KURT BENJAMIN
GRADUATE RECORD EXAMINATION 


Given I.B.M. cards punched with scores (or any numbers)—but 
not their squares—a method is presented of tabulating them (on the 
No. 405 alphameric I.B.M. tabulator) so as to obtain the sum of 
squares. The technique is also adaptable to summation of cross- 
products. The principle is an extension of the Mendenhall-Warren- 
Hollerith technique of vertical progressive digiting, without the ne- 
cessity of manual addition or summary-punching, and is designed 
for machines not equipped with the “card cycle total transfer” device
or “progressive total” device. Use is made of “counter rolling.”
Efficient use of machine capacity is made only when intercorrelations
between no more than two variables are required in addition to sums
of squares. A résumé of some techniques now commonly employed
is included. 


To obtain a sum of squares, the following are some of the I.B.M. 
methods now in common use, when detail cards contain scores (or 
any numbers), but not their squares: 


A. Selecting one master squares card for every detail card. This re- 
quires a collator and a previously prepared master file (which may, 
however, also contain other data—such as higher powers). Matched 
masters are subsequently tabulated, and then re-merged with the file (8).

B. Interspersed master gang-punching. This also requires prepared 
squares deck, use of gang-punch machine, and subsequent tabulation 
of detail cards. 


C. Use of automatic multiplying punch. Summary-products counter 
will contain sum of squares; or, squares can be punched into details
and these tabulated (3).


D. Horizontal digiting, requiring at least one digit selector, and 
three counter-groups per variable. This method also requires multiplication
and addition of totals (5).


E. Mendenhall-Warren-Hollerith correlation method: printing of
vertically progressively digited totals, from highest to lowest score.
This requires “progressive total” device, manual addition after tabulation
and allowance for gaps in the distribution (1, 2, 9).

* The author is indebted to Dr. Paul Dwyer, Associate Professor of Mathematics,
University of Michigan, for valuable criticism of the original draft; and
to Mr. Alan Meacham, in charge of the University’s Tabulating Station, for testing
the method.


F. Summary-punching vertically progressively digited totals. This
requires digit cards (for possible gaps in the distribution), summary-punch
in conjunction with tabulator (equipped with “progressive total”
device), and subsequent tabulation of summary cards.


G. Transfer of vertically progressively digited totals to another 
counter (which will finally contain the sum of squares) by use of 
“card cycle total transfer” device. Digit cards are required, but no 
“progressive total” device (6). 

Sorting is required in all cases, except (C) and (D). 

The method here described is similar to (G), but designed for 
machines not equipped with “card cycle total transfer” device. It is 
based on the same mathematical principle as (E), (F), and (G) (see
4). A detailed explanation of the wiring, with explanatory notes, is
given in an Appendix. 


Necessary Equipment 

a) Sorter; b) alphameric tabulator No. 405 (I.B.M.), equipped
with at least 2 class selectors, and a number of independent “X”-distributors
equal to the number of digits expected in the sum of scores
(5 were allowed for in the appended wiring directions). “D” pick-up
hubs must be operative; but neither “progressive total” device nor
digit selector is needed. c) 2 digit cards for every possible score from
the highest actual down to 1 (there should be none for zero scores),
with scores punched in the same fields used in detail cards. Digit cards
should contain an identifying punch, and be sorted behind details, all
in order from highest to lowest score.

For example, assume the following scores for eleven students (Nos. I to XI),
given here from highest to lowest:

SCORES: 12, 12, 11, 11, 11, 11, 9, 9, 8, 4, 0


The order of cards will be:

Detail card with 12 in columns 7-8
Detail card with 12 in columns 7-8
Digit card with 12 in columns 7-8 and “X” in column 80
Digit card with 12 in columns 7-8 and “X” in column 80
Detail card with 11 in columns 7-8
Detail card with 11 in columns 7-8
Detail card with 11 in columns 7-8
Detail card with 11 in columns 7-8
Digit card with 11 in columns 7-8 and “X” in column 80
Digit card with 11 in columns 7-8 and “X” in column 80
Digit card with 10 in columns 7-8 and “X” in column 80
Digit card with 10 in columns 7-8 and “X” in column 80
Detail card with 09 in columns 7-8
Detail card with 09 in columns 7-8
Digit card with 09 in columns 7-8 and “X” in column 80
Digit card with 09 in columns 7-8 and “X” in column 80
Detail card with 08 in columns 7-8
Digit card with 08 in columns 7-8 and “X” in column 80
Digit card with 08 in columns 7-8 and “X” in column 80
Digit card with 07 in columns 7-8 and “X” in column 80
Digit card with 07 in columns 7-8 and “X” in column 80
Digit card with 06 in columns 7-8 and “X” in column 80
Digit card with 06 in columns 7-8 and “X” in column 80
Digit card with 05 in columns 7-8 and “X” in column 80
Digit card with 05 in columns 7-8 and “X” in column 80
Detail card with 04 in columns 7-8
Digit card with 04 in columns 7-8 and “X” in column 80
Digit card with 04 in columns 7-8 and “X” in column 80
Digit card with 03 in columns 7-8 and “X” in column 80
Digit card with 03 in columns 7-8 and “X” in column 80
Digit card with 02 in columns 7-8 and “X” in column 80
Digit card with 02 in columns 7-8 and “X” in column 80
Digit card with 01 in columns 7-8 and “X” in column 80
Digit card with 01 in columns 7-8 and “X” in column 80
Detail card with 00 in columns 7-8


Machine Principle 

After all detail cards containing a particular score have passed 
the add brushes, the total of scores accumulated in a counter group 
(hereafter referred to as the transmitting counter) is transferred to 
another counter group (henceforth referred to as the receiving coun- 
ter), while the first digit card of that score group passes the lower 
brushes. As the second digit card passes the add brushes, the figure 
in the transmitting counter is restored to the total it contained prior 
to transfer. Thus, the receiving counter will finally contain a sum of 
the progressive totals, i.e., the sum of squares (digit cards take care 
of possible gaps in the distribution). The transmitting counter will 
print the final progressive total, i.e., the sum of scores. 
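
The arithmetic behind this principle can be checked with a short Python sketch. It models only the sums involved, not the tabulator or its wiring; the function name `sums_by_progressive_totals` is hypothetical, and the scores are the eleven values that appear in the tabulation at the end of the Appendix. Each score X is added once into the running total and is then included in X successive transfers (one for every value from X down to 1), which is why the received total equals the sum of squares.

```python
def sums_by_progressive_totals(scores):
    """Return (sum of scores, sum of squares) using only running additions."""
    transmitting = 0  # running total of scores (the "transmitting counter")
    receiving = 0     # accumulates progressive totals (the "receiving counter")
    for value in range(max(scores), 0, -1):  # highest score down to 1
        # Detail cards carrying this score add into the transmitting counter.
        transmitting += sum(s for s in scores if s == value)
        # The digit cards for this value trigger one transfer, even when no
        # detail card carried the value (a gap in the distribution).
        receiving += transmitting
    return transmitting, receiving


scores = [12, 12, 11, 11, 11, 11, 9, 9, 8, 4, 0]
print(sums_by_progressive_totals(scores))  # (98, 1014)
```

The sketch assumes non-negative whole-number scores, as the card method does; zero scores contribute nothing to either total, which matches the absence of digit cards for zero.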

It is obvious that this technique is easily adapted to summation
of cross-products by transferring the progressive totals of one vari- 
able at each change in the score of the other variable. 
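
The adaptation to cross-products can be sketched in the same way. This is an illustration of the arithmetic under the assumption that the cards are ordered on the second variable Y and that the progressive total of X is transferred once for every value of Y from the highest down to 1; the function name `cross_product_sum` and the sample pairs are hypothetical.

```python
def cross_product_sum(pairs):
    """Return sum(x * y) over (x, y) pairs using progressive totals of x."""
    running_x = 0  # progressive total of X
    total = 0      # receives running_x once for every value of Y
    top = max(y for _, y in pairs)
    for level in range(top, 0, -1):  # Y from highest value down to 1
        # Cards whose Y equals this level add their X to the running total.
        running_x += sum(x for x, y in pairs if y == level)
        total += running_x
    return total


pairs = [(2, 3), (1, 1), (4, 0)]
print(cross_product_sum(pairs))  # 7
```

Each X is transferred Y times in all, so the received total is the sum of the products XY.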

A permanent file of 18 (twice 9) digit cards, each punched with 
its particular digit in all card fields, would obviate digit card prepa- 
ration for each job, if control is on one column at a time. This method 
can be used with a multiple digit score if the whole score is added each 
time—partial products method. It is then necessary to multiply the 
totals of the 10’s position tabulation by 10, of the 100’s position by
100, etc., and add them to the units position total.
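
A minimal sketch of this partial-products variant follows, again modelling only the arithmetic. It assumes that the cards are controlled on one digit column per run, that the whole score is added on every run, and that digit cards 9 down to 1 trigger the transfers; the function name `sum_of_squares_by_digit` and the `positions` parameter are hypothetical.

```python
def sum_of_squares_by_digit(scores, positions=2):
    """Sum of squares with control on one digit column at a time."""
    total = 0
    for p in range(positions):  # p = 0 units, 1 tens, 2 hundreds, ...
        running = 0   # progressive total of whole scores
        partial = 0   # ends as sum(score * digit in position p)
        for digit in range(9, 0, -1):  # digit cards 9 down to 1
            running += sum(s for s in scores if (s // 10 ** p) % 10 == digit)
            partial += running
        total += partial * 10 ** p  # weight by 1, 10, 100, ...
    return total


scores = [12, 12, 11, 11, 11, 11, 9, 9, 8, 4, 0]
print(sum_of_squares_by_digit(scores))  # 1014
```

The units run yields the sum of each score times its units digit, the tens run the sum of each score times its tens digit, and the weighted total of the runs is the sum of squares.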


Limitations 
In view of the large number of selectors required and the rela- 
tively inefficient utilization of counter capacity, this method is rec- 
ommended only when sums of squares, but no intercorrelations, are 
needed, or when the latter are confined to two variables. 
APPENDIX


WIRING DIRECTIONS: (Assume two-digit scores in columns 7-8, and an “X”
in column 80 of digit cards. Class selector hubs have been assigned consecutive
numbers 1-10, reading from left to right):


From control brush 80 to “X” pick-up of class selector D. 

From “X” pick-up of selector D to common position 1 of selector D. 

From controlled position 1 of selector D to common position 10 of selector C. 

From normal position 10 of selector C to “X” pick-up of selector C. 

From add brushes 7-8 to normal positions 9-10 of selector D. 

From common positions 5-10 of selector D to entry hubs of counter 6B(19-24).

From common positions 1-6 of selector C to controlled positions 5-10 of selector D. 

From “subtract units position control” to normal positions 1-6 of selector C. 

From controlled positions of X-distributors 1-5 to controlled positions 1-5 of se- 
lector C. 

From card count to common position of X-distributors 1-5. 

From unequal impulse outlets 1-5 (comparing unit) to “D” pick-ups of X-dis- 
tributors 1-5. 

From “hot nine” hub of any counter(s) (hub to left of S.U.P. entry) to upper 
comparing relays 1-5. 

From “hot nine” hub of counter 6B to S.U.P. entry of counter 6B. 

From “hot nine” hub of counter 6D to S.U.P. entry of counter 6D. 

From “CI” hub of counter 6B to “C” hub of counter 6B. 

From “CI” hub of counter 6D to “C” hub of counter 6D. 

From counter total exit 6B(19-24) to counter entry 6D(59-64) and to lower
typebars (nine’s complement of ΣX).

From counter total exit 6B (positions 20-24 only) to lower comparing relays
1-5.

From counter total exit 6D (59-64) to lower typebars (ΣX²).

From “plug to ‘C’” impulse hub to common position 9 of selector C. 

From controlled position 9 of selector C to —(minus) of counter 6B.

From normal position 9 of selector C to common position 2 of selector D. 

From normal position 2 of selector D to —(minus) of counter 6B. 

From controlled position 2 of selector D to + (plus) of counter 6B. 

From + (plus) of counter 6B to —(minus) of counter 6D. 

From “minor class of total” to “class of total control” of counter 6B. 

From “intermediate class of total” to “class of tot. contr.” of 6D. 


(The two counter groups must be reset on different total cycles. Otherwise the 
balance test and selector “total” impulses preceding total cycles will prevent the 
left-most position of 6D from resetting, whenever the equivalent position of coun- 
ter 6B contains a 9—as it will, in view of the use of complements). Automatic 
checks for the presence and correct position of digit cards can easily be devised. 


EXPLANATORY NOTES: It will have been observed in the above wiring di- 
rections that, in order to transfer the total from one counter to another, 10 
(S.U.P.C. = 10) was added to the number in each position of the transmitting 
counter, thus causing 1’s to carry-over. The second digit card allows time for 
correction of the carry-over, thus restoring the correct total. However, whenever 
a 9 stands in a counter position and a 10 is added in that position and the posi- 
tion to its right, the expected carry-over of 2 into the position to the left of the 
first-mentioned does not occur, since the machine will carry only 1 on any one 
cycle. This left position must therefore not be corrected for carry-over. This is
the reason for the comparing unit plugging; e.g.:

hypothetical counter total:      0 |  3 |  9 |  6   before transfer.
first digit card:               10 | 10 | 10 | 10   (S.U.P.C. for transfer).
counter total after transfer:    1 |  4 |  0 |  6   NOT 1506.
second digit card:               1 |  - |  1 |  -   (-1 after transfer).
after correction:                0 |  3 |  9 |  6   = as before transfer.





If a “1” had also been deducted in the 100’s position, an incorrect total would 
have been obtained (units position is never corrected, since 100,000’s position, 
from which it receives its carry-over—‘“C.I.” to “C”— should always be at 9: 
vide below for reason for use of complements). 

Since it is necessary that the transmitting counter add, and the receiving 
counter subtract, during transfer, the scores also were subtracted into the trans- 
mitting counter, so that the sum of squares would be a true figure (complement 
of a complement). The sum of scores will be a nine’s complement, unless printed 
from an independently cumulating counter group. Use of ten’s complements en- 
tails sacrifice of one receiving counter position. 
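
The use of nine’s complements can be illustrated with a short Python sketch. It is an assumption-laden model of the arithmetic only (a six-position counter, subtraction by adding the nine’s complement, and a carry out of the highest position re-entering at the units position), not a description of the plugboard; the names `subtract_into_counter` and `true_value` are hypothetical, and the scores are those of the tabulation that follows.

```python
POSITIONS = 6
MODULUS = 10 ** POSITIONS


def subtract_into_counter(counter, score):
    """Subtract score in nine's-complement form, with end-around carry."""
    total = counter + (MODULUS - 1 - score)  # add the nine's complement
    if total >= MODULUS:                     # carry out of the top position
        total = total - MODULUS + 1          # re-enters at the units position
    return total


def true_value(counter):
    """Read a nine's-complement counter back as an ordinary total."""
    return MODULUS - 1 - counter


counter = 0
for score in [12, 12, 11, 11, 11, 11, 9, 9, 8, 4, 0]:
    counter = subtract_into_counter(counter, score)
print(f"{counter:06d}", true_value(counter))  # 999901 98
```

After the first two detail cards the modelled counter stands at 999987 and then 999975, and it ends at 999901, the nine’s complement of 98.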

NOTE: Progressive totals can be “counter-listed” from receiving counter 
during transfer in alphameric typebars (numeric typebars would print symbol 
whenever a 9 is being transferred). 
TOTALS STANDING IN BOTH COUNTER-GROUPS, after each card has
passed the add brushes, assuming hypothetical scores given in body of article:

CARD NO.   COLS. 7-8   COL. 80   COUNTER 6B       COUNTER 6D
                                 (transmitting)   (receiving)

1st        12          -         999987           000000
2nd        12          -         999975           000000
3rd        12          X         000086           000024
4th        12          X         999975           000024
5th        11          -         999964           000024
6th        11          -         999953           000024
7th        11          -         999942           000024
8th        11          -         999931           000024
9th        11          X         000042           000092
10th       11          X         999931           000092
11th       10          X         000042           000160
12th       10          X         999931           000160
13th       09          -         999922           000160
14th       09          -         999913           000160
15th       09          X         000024           000246
16th       09          X         999913           000246
17th       08          -         999905           000246
18th       08          X         000016           000340
19th       08          X         999905           000340
20th       07          X         000016           000434
21st       07          X         999905           000434
22nd       06          X         000016           000528
23rd       06          X         999905           000528
24th       05          X         000016           000622
25th       05          X         999905           000622
26th       04          -         999901           000622
27th       04          X         000012           000720
28th       04          X         999901           000720
29th       03          X         000012           000818
30th       03          X         999901           000818
31st       02          X         000012           000916
32nd       02          X         999901           000916
33rd       01          X         000012           001014
34th       01          X         999901           001014
35th       00          -         999901           001014

Nine’s complement of ΣX: 999901

ΣX: 98


























THURSTONE, L. L. A Factorial Study of Perception. Chicago: University of 
Chicago Press, 1944. Pp. vi + 148. 


A REVIEW 


This imposing study sets forth the Pearson product-moment correlations 
among 60 selected variables, and reports a multiple-factor analysis of 43 of these 
variables. The 17 scores excluded from the factor analysis were judged free from 
significant common-factor variance by reason of their insignificant correlations 
with other variables in the battery. Most of the tests retained for the factor 
analysis are tests of visual perception; some of the tests are new. In selecting 
the perceptual measures, preference was given “to those perceptual effects which 
. .. might conceivably be centrally [rather than peripherally] determined” (p. 1). 
The subjects for the study consisted mostly of University of Chicago undergradu- 
ate volunteers, the typical correlation being based on about 170 cases. The tests 
were given in four sessions—three sessions for the individual laboratory tests, 
and one for the group tests. 

Among the perceptual tests employed were such as the following: Street 
gestalt completion; peripheral visual span; hidden digits; autokinetic movement; 
flicker fusion; spiral aftermovement; Müller-Lyer illusion; size-weight illusion;
Schmidt color-form ratio; color-form sorting; retinal rivalry reversals; shape con- 
stancy; and hidden pictures. The battery, however, was not limited to perceptual 
tests alone. Tests which the present reviewer would classify as partly perceptual 
and partly intellectual are the Gottschaldt figures, the Kohs Block Designs, and 
the Space tests of the Primary Mental Abilities battery. Purely intellectual vari- 
ables included scores V, N, W, and R from the Primary Mental Abilities battery. 
In addition, the battery of 60 variables included tests of which the following may 
be selected as representative: reaction time to sound; speed of dark adaptation;
speed of judgment; visual-motor coordination (two-hand); free association to 
selected verbal stimuli; and a modification of mirror drawing. 

Specifically excluded from the main study were all direct measurements of 
personality (such as the Guilford personality schedule). It is the author’s avowed 
goal, eventually, to obtain measures of personality through the perceptual tests. 
Two major advantages of such perceptual measures of personality would be ob- 
jectivity and freedom from deception—both deliberate deception and unconscious 
self-deception. Underlying the hope that objective perceptual measures may 
yield measures of personality is the broad theory of interdependence of all func- 
tions of the individual. According to Thurstone, “it would be difficult to main- 
tain that any of these functions, such as perception, is isolated from the rest of 
the dynamical system that constitutes the person” (p. 3). In passing, the re- 
viewer must confess that he finds no difficulty at all in maintaining that many 
independent or “isolated” functions do exist. For example: Visual or tactual 
perception is basic to (say) reading, and reading is practically essential both to 
learning and the attitudes that learning can promote. But vision may be excel- 
lent and reading quite copious—yet learning may be very poor, and attitudes 
remain nearly at the level of the animal kingdom. The evidence from correlational
studies suggests that the philosophical holism of the gestaltists is a dogma,
and the “integration” of the mental hygienists only a hope or ideal. Indirect 
evaluation of personality—whether by such measures as perceptual tests, the in- 
terpretation of Rorschach scores, or whatnot—may be possible; but the burden 
of proof falls heavily on the advocates of such indirect measures. Thurstone 
appears to appreciate this when he remarks, “Such an approach as here repre- 
sented is very much of a gamble” (p. 2). 

The factor analysis of the 43 selected variables was carried out by Thur- 
stone’s multiple-factor technique; the rotational problem was handled by plotting 
the graphs for paired columns of the factor-matrix on the printing tabulating 
machine, using a method devised by Ledyard Tucker. Most of the correlations 
among the final factors are near zero; none exceeds .25 (p. 123). Very briefly,
the 11 factors extracted were interpreted by Thurstone as representing: 

A. Facility and firmness in perceptual closure (shape constancy, Gottschaldt 
figures, Spatial score (from Primary Mental Abilities battery)).

B. Error in response to optical illusions (Müller-Lyer, Poggendorff, etc.).

C. Reaction time (light and sound). 

D. Perceptual oscillation (retinal rivalry, Necker cube). 

E. Freedom from Gestaltbindung, flexibility in manipulating several more 
or less irrelevant or conflicting configurations simultaneously or in succession 
(two-hand coordination, hidden pictures, Reasoning (from Primary Mental Abilities
battery)).

F. Speed of perception (perceptual span, dark adaptation, Gottschaldt 
figures, Street gestalt completion). 

G. A second-order general factor common to the composite scores on the 
Primary Mental Abilities. 

H. A factor reflecting performance on the Schmidt apparatus for differ- 
entiating between form- and color-dominance. 

J. Speed of judgment (color-form sorting time; social judgments time). 

K. Rorschach Test scores (total number of responses; and tendency toward 
“whole” responses). 

L. An unidentified residual factor. 

It is obvious that some of these factors represent an addition to our under- 
standing of the perceptual field. The reviewer would like to qualify Factor F to 
the extent of suggesting that this factor seems, for the most part, to reflect speed 
in perception based on minimal cues. With regard to the findings on speed, 
Thurstone remarks that the results “point to the desirability of investigating 
speed in distinguishable functions, rather than speed as a generalized factor” 
(p. 119). It appears of some special interest that “. . . the well-known optical 
illusions do have a factor in common, in spite of the fact that their intercorrela- 
tions are on the whole rather low” (p. 120). Although the Primary Mental Abil- 
ities are in general unrelated to the perceptual factors, the Space factor is def- 
initely related to Factor A (loading of .54) and the Reasoning factor to Factor 
F (loading of .55). 

An important finding relates to color-form dominance. The two tests which 
appear to be the best measure of this function are (a) the Schmidt color-form 
ratio, and (b) color-form sorting (variables 16 and 34, respectively). The prod- 
uct-moment correlation between these two variables is —.013. As Thurstone puts 
it, “Color and form dominance did not emerge as a distinguishable category” 
(p. iv). The suggestion is made that surface texture may be a quality requiring 
special consideration in the study of the color-form problem (pp. 116, 122). 

The Rorschach Test does not draw favorable comment in this study. Probably
those inclined to defend the Rorschach would say that Rorschach “scores” do
not belong in a correlational analysis. The complete configuration of scores must 
first be submitted to “interpretation” according to Rorschach principles. These 
interpretations may then be included in the list of variables (provided, of course, 
that the interpretations are quantitatively stated); but the original Rorschach 
scores taken alone are virtually meaningless. 

The last chapter of the monograph reports results from the application of 
perceptual and other tests to four special groups: a University of Chicago group 
of fast vs. slow readers; a University of Chicago group of campus leaders; a 
Washington, D.C., group of interns in public administration, who had been rated 
on professional promise and success; a group of Washington, D.C., administra- 
tors classified as (a) analysts vs. (b) personnel men; and finally, the same group 
of administrators classified according to salary (adjusted for age). Space is lacking
for comment on these studies, except to say that some reliable test-differentiations
were achieved, and leads for further investigation noted. These studies
of special groups were exploratory and are only very briefly reported. In the 
study of these special groups, the statistical procedure consisted of preparing 
four-fold tables and determining chi-square. It goes hard with this reviewer to 
see quantitative data, such as test-scores or salaries, reduced to two categories 
only. 

One statistical comment applies to the whole monograph: not a single reli- 
ability coefficient is reported, either for test-scores or factor-scores. This appears 
rather undesirable even in a “frankly exploratory” study (p. 101), particularly 
since some of the tests are new, or have been modified in presentation or scoring. 

A study on a scale so large as this one involves a heavy burden of appa- 
ratus-construction, test administration, test scoring, statistical analysis, and pro- 
fessional cooperation. Thurstone’s indebtedness to his colleagues and assistants 
is appropriately recorded both in the Preface and throughout the text. 

In summary, this monograph makes a significant contribution both in the 
presentation of new tests, and in the study of interrelations among selected vari- 
ables centering about perception. New understanding has been gained of the 
perceptual field. In addition, a series of studies has been executed, exploring the 
value of perceptual and other tests in the differentiation of important special 
groups. These contributions are more than sufficient to make this monograph a 
landmark of current psychological progress. 

HERBERT S. CONRAD 


College Entrance Examination Board 
Princeton, N. J. 